Quote from: sgtstein on August 13, 2010, 11:17:51 PM
- Do we know why it doesn’t work on 32-bit? Is it because it’s using 128 bits, and if so, would it help if we dropped it to 64?
No idea, maybe some alignment problem. Someone was trying to figure it out on IRC. I don’t have an SSE2-capable 32-bit system to test on. The additional registers in 64-bit mode are also useful. I don’t know if your PE2650 has a recent enough CPU. You might see a performance drop of 50% if the CPU is too old.
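If you want to check what the CPU in a box reports, here is a rough sketch for Linux. It assumes an x86 machine where /proc/cpuinfo has a "flags" line; the helper names (cpu_flags, has_sse2, has_long_mode, read_cpuinfo) are made up for this example and are not part of the miner. The "sse2" flag means the CPU supports SSE2, and "lm" (long mode) means it can run 64-bit code. It says nothing about how fast the SSE2 units are.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is used; on typical systems every
    logical CPU lists the same flags.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no flags line found (e.g. non-x86 or non-Linux)


def has_sse2(cpuinfo_text):
    """True if the CPU advertises SSE2."""
    return "sse2" in cpu_flags(cpuinfo_text)


def has_long_mode(cpuinfo_text):
    """True if the CPU advertises 64-bit long mode ("lm")."""
    return "lm" in cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, or return an empty string if unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    text = read_cpuinfo()
    print("SSE2:", has_sse2(text))
    print("64-bit capable:", has_long_mode(text))
```

Note that a CPU can report "lm" while the installed OS is still 32-bit, so this only tells you what the hardware could do.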
Btw, did anyone with an Intel CPU compare performance with Hyperthreading enabled vs. disabled? The SSE2 loop keeps the arithmetic units and pipelines pretty busy, so I can imagine Hyperthreading might decrease performance.