Re: 4 hashes parallel on SSE2 CPUs for 0.3.6

Participants: tcatm

Performance of the stock code (as measured by my test/benchmark program) is about 1500 khash/s. My code does 3500 khash/s. Both figures are for one core. It scales well because I do 128 hashes at once and keep the data structures small enough to fit in the CPU cache.
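
To illustrate the basic idea behind the "4 hashes parallel" in the subject line, here is a minimal sketch. It is not taken from my patch. It computes one partial SHA-256 round term, h + Sigma1(e) + Ch(e,f,g) + K + W, for four independent hashes at once by putting one hash in each 32-bit lane of an SSE2 register. The names t1_scalar, t1_x4, ROTR_X4 and t1_lanes_match_scalar are made up for this example, and the input values are arbitrary test data.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (x86 / x86-64 only) */
#include <stdint.h>

/* Scalar reference: 32-bit rotate right, as used throughout SHA-256. */
static uint32_t rotr32(uint32_t x, unsigned n) {
    return (x >> n) | (x << (32 - n));
}

/* Scalar reference for one hash: h + Sigma1(e) + Ch(e,f,g) + k + w. */
static uint32_t t1_scalar(uint32_t e, uint32_t f, uint32_t g,
                          uint32_t h, uint32_t k, uint32_t w) {
    uint32_t s1 = rotr32(e, 6) ^ rotr32(e, 11) ^ rotr32(e, 25);
    uint32_t ch = (e & f) ^ (~e & g);
    return h + s1 + ch + k + w;
}

/* SSE2 has no rotate instruction, so build one from two shifts and an OR.
   A macro keeps the shift counts compile-time constants. */
#define ROTR_X4(x, n) \
    _mm_or_si128(_mm_srli_epi32((x), (n)), _mm_slli_epi32((x), 32 - (n)))

/* Same term for four hashes at once: each 32-bit lane holds the state
   of one independent hash. The round constant k is shared by all lanes. */
static __m128i t1_x4(__m128i e, __m128i f, __m128i g,
                     __m128i h, uint32_t k, __m128i w) {
    __m128i s1 = _mm_xor_si128(
        _mm_xor_si128(ROTR_X4(e, 6), ROTR_X4(e, 11)), ROTR_X4(e, 25));
    /* _mm_andnot_si128(a, b) computes (~a) & b */
    __m128i ch = _mm_xor_si128(_mm_and_si128(e, f), _mm_andnot_si128(e, g));
    __m128i sum = _mm_add_epi32(h, s1);
    sum = _mm_add_epi32(sum, ch);
    sum = _mm_add_epi32(sum, _mm_set1_epi32((int)k));
    return _mm_add_epi32(sum, w);
}

/* Self-check: returns 1 if every lane agrees with the scalar code. */
static int t1_lanes_match_scalar(void) {
    /* Arbitrary test data, one column per lane. */
    const uint32_t e[4] = {0x510e527fu, 0x9b05688cu, 0x1f83d9abu, 0x5be0cd19u};
    const uint32_t f[4] = {0x00000001u, 0xffffffffu, 0x12345678u, 0x80000000u};
    const uint32_t g[4] = {0xdeadbeefu, 0x00000000u, 0x0f0f0f0fu, 0xcafebabeu};
    const uint32_t h[4] = {0x6a09e667u, 0xbb67ae85u, 0x3c6ef372u, 0xa54ff53au};
    const uint32_t w[4] = {0x00000000u, 0x00000001u, 0x00000002u, 0x00000003u};
    const uint32_t k = 0x428a2f98u;  /* first SHA-256 round constant */
    uint32_t out[4];
    int i;

    __m128i r = t1_x4(_mm_loadu_si128((const __m128i *)e),
                      _mm_loadu_si128((const __m128i *)f),
                      _mm_loadu_si128((const __m128i *)g),
                      _mm_loadu_si128((const __m128i *)h),
                      k,
                      _mm_loadu_si128((const __m128i *)w));
    _mm_storeu_si128((__m128i *)out, r);

    for (i = 0; i < 4; i++) {
        if (out[i] != t1_scalar(e[i], f[i], g[i], h[i], k, w[i]))
            return 0;
    }
    return 1;
}
```

In a miner the four lanes would presumably hold four different nonces for the same block header, so most of the inputs are shared and only the nonce word differs per lane. That is an assumption about how such code would be wired up, not a description of the actual patch.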

I have two local collision attacks that will squeeze out another 300 khash/s, but they are not stable yet.