I added a small fix after I built the binary: it now tries to leave one core unused so that it doesn’t bog down. I have a MacBook with a 9400M as well, and since that GPU is so slow (only about 500k/sec), it doesn’t really lose much if there are 2 CPU threads running. But on my desktop with an 8800 (about 3500k/sec), the rate drops dramatically if I don’t leave one of the cores free. So on the MacBook it might be better to just run 2 CPU threads anyway. Maybe I’ll add an option for tweaking this; it’s all just a prototype right now.
I’m getting about 950k/sec total with the 9400M + one CPU thread on my MacBook. It seemed like I was getting about 1000-1100k/sec with 2 CPU threads + 9400M, and about 700-800k/sec with just the CPUs.
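
For anyone curious, the "leave one core free" idea can be sketched roughly like this in Python. This is only an illustration, not the code in the binary: `cpu_worker_count`, `gpu_in_use`, and `reserve` are made-up names, and always keeping at least one CPU worker (even on a single-core machine) is my assumption.

```python
import os


def cpu_worker_count(gpu_in_use, total_cores=None, reserve=1):
    """Hypothetical helper: decide how many CPU worker threads to start.

    When a GPU is doing work, hold back `reserve` cores so the thread
    feeding the GPU is not competing with the CPU workers.
    """
    if total_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        total_cores = os.cpu_count() or 1
    if not gpu_in_use:
        # CPU-only run: use every core.
        return total_cores
    # Assumption: never drop below one CPU worker.
    return max(1, total_cores - reserve)


if __name__ == "__main__":
    print("CPU threads with GPU active:", cpu_worker_count(gpu_in_use=True))
```

On a 2-core MacBook this gives 1 CPU thread alongside the GPU by default. Passing `reserve=0` gives 2 threads, which is the kind of tweak option that would make sense for a slow GPU like the 9400M.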