Interesting detail: measured timings depend on the order of the tests
in the sequence. Try swapping frames 1 and 3 and the results will look
different. It may be better to put all implementations in the same
frame (e.g. a case structure selected by an enum).
Anyway, on my system DIV and AND are very close in time consumption.
AND tends to be a bit faster (and it's definitely the nicer solution).
Hi Tomi, I noticed this too: conversion between I32 and U32 is just a typecast. When you convert -1 to U32, you get 4 billion-something (2^32 - 1 = 4,294,967,295); it is not truncated to zero. But if you convert a negative floating-point value (e.g. a SGL, which is also stored in 32 bits), the result is coerced to zero rather than reinterpreted. See the attached pictures.
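
Since the pictures may not be visible to everyone, here is a rough text sketch of the same difference in Python (LabVIEW being graphical). It only models the behaviour described above: the function names are made up for illustration, and the clamping of out-of-range floating-point values is my assumption about what the conversion does, not something taken from the LabVIEW documentation.

```python
import struct

U32_MAX = 2**32 - 1  # 4294967295


def i32_to_u32(value: int) -> int:
    """Reinterpret the 32 bits of a signed integer as unsigned (hypothetical helper).

    The bit pattern is kept as-is, so no range check or clamping happens.
    """
    return struct.unpack("<I", struct.pack("<i", value))[0]


def float_to_u32(value: float) -> int:
    """Convert a floating-point value to U32 by value (hypothetical helper).

    Assumption: the value is rounded and then clamped into 0..2^32-1,
    so negative inputs end up as zero.
    """
    return max(0, min(U32_MAX, round(value)))


print(i32_to_u32(-1))      # 4294967295: same bits, different interpretation
print(float_to_u32(-1.0))  # 0: converted by value and clamped
```

So the integer case keeps the bits and changes their meaning, while the floating-point case converts the numeric value and has to fit it into the target range.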