Version 8 (modified by jstolarek, 4 years ago) (diff) |
---|
Implementing primitive Bool#
This page gathers the notes about implementing primitive logical operations and thus resolving ticket #6135.
The problem
Consider following fragment of code:
case (x <# 0#) || (x >=# width) || (y <# 0#) || (y >=# height) of True -> E1 False -> E2
This kind of code is common in image processing (and array programming in general) where one needs to check whether the (x,y) coordinates are within the image. Primitive comparison operators <# and >=# have type Int# -> Int# -> Bool. Logical OR operator (||) is defined as:
(||) :: Bool -> Bool -> Bool True || _ = True False || x = x
in GHC.Classes (ghc-prim library) which is equivalent of:
(||) x y = case x of True -> True False -> y
During the compilation process (assuming the optimizations are turned on) the definition of (||) gets inlined and then case-of-case transform is performed successively. This results in following Core (cleaned up for clarity):
case <# x 0 of _ { False -> case >=# x width of _ { False -> case <# y 0 of _ { False -> case >=# y height of _ { False -> E2 True -> E1 }; True -> E1 }; True -> E1 }; True -> E1 };
and in following assembler code:
.Lc1rf: testq %r14,%r14 jl .Lc1rk cmpq %rdi,%r14 jge .Lc1rp testq %rsi,%rsi jl .Lc1ru cmpq %r8,%rsi jge .Lc1rz movl $Main_g2_closure+1,%ebx jmp *0(%rbp) .Lc1rk: movl $Main_g1_closure+1,%ebx jmp *0(%rbp) .Lc1rp: movl $Main_g1_closure+1,%ebx jmp *0(%rbp) .Lc1ru: movl $Main_g1_closure+1,%ebx jmp *0(%rbp) .Lc1rz: movl $Main_g1_closure+1,%ebx jmp *0(%rbp)
There are five possible branches to take, although four of them have the same result. This is caused by code duplication introduced by case-of-case transform (see this blog post for a step by step derivation). According to Ben Lippmeier, who submitted the original bug report, mis-predicted branches are bad in object code because they stall the pipeline.
Workarounds
It is possible to work around the issue of code duplication by using GHC primops tagToEnum# and dataToTag#. These allow to distinguish between True and False by means of accessing the tag of a data type constructor. This means that dataToTag# can convert True to 1# and False to 0#, while tagToEnum# does the opposite (see paper Faster Laziness Using Dynamic Pointer Tagging for more details):
ghci> import GHC.Exts ghci> import GHC.Prim ghci> :set -XMagicHash ghci> I# (dataToTag# True) 1 ghci> I# (dataToTag# False) 0 ghci> (tagToEnum# 0#) :: Bool False ghci> (tagToEnum# 1#) :: Bool True ghci>
Having the possibility of converting Bool to an unboxed Int# allows us to compute results of logical expression by means of logical bitwise operations. The result can be converted back to a Bool so this is transparent on the Haskell source level, except for the fact that defined logical binary operators will be strict in both their arguments.
NOTE: Validity of this solution is based on assumption that True will always have a tag of 1#, while False will have a tag of 0#. Changing this invariant in the future would make these primitive logical operators invalid.
First workaround
First workaround assumes converting each result of comparison into an unboxed Int and replacing || with +#:
case (dataToTag# (x <# 0#)) +# (dataToTag# (x >=# width)) +# (dataToTag# (y <# 0#)) +# (dataToTag# (y >=# height)) of 0# -> E2 -- note that branch order is reversed _ -> E1
This compiles to:
case +# (+# (+# (dataToTag# (<# x 0)) (dataToTag# (>=# x width))) (dataToTag# (<# y 0))) (dataToTag# (>=# y height)) of _ { __DEFAULT -> E1; 0 -> E2 }
Similarly we can convert logical && into multiplication.
Second workaround
The above workaround is a bit clumsy: dataToTag#s make the code verbose and it may not be very obvious what the code is doing. Hence the second workaround, that defines an alternative logical or operator:
(||#) :: Bool -> Bool -> Bool (||#) x y = let xW = int2Word# (dataToTag# x) yW = int2Word# (dataToTag# y) zI = word2Int# (yW `or#` xW) in tagToEnum# zI
This operator is defined in terms of primops dataToTag#, tagToEnum# and a bitwise or primop or#. Since the last one operates only on Words we need to use int2Word# and word2Int# for conversion between these data types. Luckily, GHC does a good job of removing unnecessary conversions between data types. This means that:
case (x <# 0#) ||# (x >=# width) ||# (y <# 0#) ||# (y >=# height) of True -> E1 False -> E2
compiles to:
case tagToEnum# (word2Int# (or# (int2Word# (dataToTag# (>=# y height))) (or# (int2Word# (dataToTag# (<# y 0))) (or# (int2Word# (dataToTag# (>=# x width))) (int2Word# (dataToTag# (<# x 0))))))) of _ { False -> E2; True -> E1 }
Primitive logical operators &&# and not# can be defined in a similar matter.
Solutions
It seems that the best solution to the problem would be implementing second of the above workarounds as a separate primop. Alternatively we can implement primitive bitwise or, and and xor that work on Ints instead of Words and define new logical operators using bitwise operators in one of the libraries.
Places of interest in the source code
The file prelude/primops.txt.pp defines PrimOps and their type signatures.
Attachments (1)
- prim-bool-criterion.png (107.6 KB) - added by jstolarek 3 years ago.
Download all attachments as: .zip