Implement new `clz` inline primop
While drafting a patch for #9281 (closed), I had to go through the C FFI and a gcc/clang `__builtin`
to get access to the CLZ instruction:
-- | Compute base-2 log of 'Word#'
--
-- This is internally implemented via the count-leading-zeros machine instruction.
foreign import ccall unsafe "integer_gmp_word_log2"
  wordLog2# :: Word# -> Int#
HsInt
integer_gmp_word_log2(HsWord x)
{
#if (SIZEOF_HSWORD) == (SIZEOF_INT)
return x ? (WORD_SIZE_IN_BITS-1) - __builtin_clz(x) : -1;
#elif (SIZEOF_HSWORD) == (SIZEOF_LONG)
return x ? (WORD_SIZE_IN_BITS-1) - __builtin_clzl(x) : -1;
#elif (SIZEOF_HSWORD) == (SIZEOF_LONG_LONG)
return x ? (WORD_SIZE_IN_BITS-1) - __builtin_clzll(x) : -1;
#else
# error unsupported SIZEOF_HSWORD
#endif
}
Since a `clz`-like operation should be available on most CPUs, GHC should expose it as a primop (`clz# :: Word# -> Int#`).
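
For reference, the relationship between count-leading-zeros and the base-2 log used in the wrapper above can be sketched in standalone C. This is only an illustration under a few assumptions: the names `clz64`, `word_log2` and `WORD_BITS` are hypothetical and not part of GHC or integer-gmp, the word is fixed at 64 bits, and `unsigned long long` is assumed to be exactly 64 bits wide. The loop in the `#else` branch is a plain fallback for compilers without the builtin, not what a primop would compile to.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical names for illustration; not part of GHC or integer-gmp. */
#define WORD_BITS ((int)(sizeof(uint64_t) * CHAR_BIT))

/* Count leading zero bits of a 64-bit word.
 * Returns WORD_BITS for x == 0, because __builtin_clzll(0) is undefined. */
static int clz64(uint64_t x)
{
    if (x == 0)
        return WORD_BITS;
#if defined(__GNUC__) || defined(__clang__)
    /* Assumes unsigned long long is exactly 64 bits wide. */
    return __builtin_clzll((unsigned long long)x);
#else
    /* Portable fallback: scan from the most significant bit downwards. */
    int n = 0;
    uint64_t mask = (uint64_t)1 << (WORD_BITS - 1);
    while ((x & mask) == 0) {
        n++;
        mask >>= 1;
    }
    return n;
#endif
}

/* Base-2 log rounded down; yields -1 for x == 0, like the wrapper above. */
static int word_log2(uint64_t x)
{
    return (WORD_BITS - 1) - clz64(x);
}
```

With a primop in place, the zero check and the subtraction could live on the Haskell side, and the FFI call would no longer be needed.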