wiki:PrimBool

Version 8 (modified by jstolarek, 15 months ago) (diff)

--

Implementing primitive Bool#

This page gathers the notes about implementing primitive logical operations and thus resolving ticket #6135.

The problem

Consider following fragment of code:

 case (x <# 0#) || (x >=# width) || (y <# 0#) || (y >=# height) of
  True  -> E1
  False -> E2

This kind of code is common in image processing (and array programming in general) where one needs to check whether the (x,y) coordinates are within the image. Primitive comparison operators <# and >=# have type Int# -> Int# -> Bool. Logical OR operator (||) is defined as:

(||)       :: Bool -> Bool -> Bool
True  || _ =  True
False || x =  x

in GHC.Classes (ghc-prim library) which is equivalent of:

(||) x y = case x of
            True  -> True
            False -> y

During the compilation process (assuming the optimizations are turned on) the definition of (||) gets inlined and then case-of-case transform is performed successively. This results in following Core (cleaned up for clarity):

case <# x 0 of _ {
  False ->
    case >=# x width of _ {
      False ->
        case <# y 0 of _ {
          False ->
            case >=# y height of _ {
              False -> E2
              True  -> E1
            };
          True -> E1
        };
      True -> E1
    };
  True -> E1
};

and in following assembler code:

.Lc1rf:
        testq %r14,%r14
        jl .Lc1rk
        cmpq %rdi,%r14
        jge .Lc1rp
        testq %rsi,%rsi
        jl .Lc1ru
        cmpq %r8,%rsi
        jge .Lc1rz
        movl $Main_g2_closure+1,%ebx
        jmp *0(%rbp)
.Lc1rk:
        movl $Main_g1_closure+1,%ebx
        jmp *0(%rbp)
.Lc1rp:
        movl $Main_g1_closure+1,%ebx
        jmp *0(%rbp)
.Lc1ru:
        movl $Main_g1_closure+1,%ebx
        jmp *0(%rbp)
.Lc1rz:
        movl $Main_g1_closure+1,%ebx
        jmp *0(%rbp)

There are five possible branches to take, although four of them have the same result. This is caused by code duplication introduced by case-of-case transform (see this blog post for a step by step derivation). According to Ben Lippmeier, who submitted the original bug report, mis-predicted branches are bad in object code because they stall the pipeline.

Workarounds

It is possible to work around the issue of code duplication by using GHC primops tagToEnum# and dataToTag#. These allow to distinguish between True and False by means of accessing the tag of a data type constructor. This means that dataToTag# can convert True to 1# and False to 0#, while tagToEnum# does the opposite (see paper Faster Laziness Using Dynamic Pointer Tagging for more details):

ghci> import GHC.Exts
ghci> import GHC.Prim
ghci> :set -XMagicHash
ghci> I# (dataToTag# True)
1
ghci> I# (dataToTag# False)
0
ghci> (tagToEnum# 0#) :: Bool
False
ghci> (tagToEnum# 1#) :: Bool
True
ghci>

Having the possibility of converting Bool to an unboxed Int# allows us to compute results of logical expression by means of logical bitwise operations. The result can be converted back to a Bool so this is transparent on the Haskell source level, except for the fact that defined logical binary operators will be strict in both their arguments.

NOTE: Validity of this solution is based on assumption that True will always have a tag of 1#, while False will have a tag of 0#. Changing this invariant in the future would make these primitive logical operators invalid.

First workaround

First workaround assumes converting each result of comparison into an unboxed Int and replacing || with +#:

 case (dataToTag# (x <# 0#)) +# (dataToTag# (x >=# width)) +#
      (dataToTag# (y <# 0#)) +# (dataToTag# (y >=# height)) of
  0# -> E2 -- note that branch order is reversed
  _  -> E1

This compiles to:

case +#
       (+#
          (+# (dataToTag# (<# x 0)) (dataToTag# (>=# x width)))
          (dataToTag# (<# y 0)))
       (dataToTag# (>=# y height))
of _ {
  __DEFAULT -> E1;
  0 -> E2
}

Similarly we can convert logical && into multiplication.

Second workaround

The above workaround is a bit clumsy: dataToTag#s make the code verbose and it may not be very obvious what the code is doing. Hence the second workaround, that defines an alternative logical or operator:

(||#) :: Bool -> Bool -> Bool
(||#) x y = let xW = int2Word# (dataToTag# x)
                yW = int2Word# (dataToTag# y)
                zI = word2Int# (yW `or#` xW)
            in tagToEnum# zI

This operator is defined in terms of primops dataToTag#, tagToEnum# and a bitwise or primop or#. Since the last one operates only on Words we need to use int2Word# and word2Int# for conversion between these data types. Luckily, GHC does a good job of removing unnecessary conversions between data types. This means that:

 case (x <# 0#) ||# (x >=# width) ||# (y <# 0#) ||# (y >=# height) of
  True  -> E1
  False -> E2

compiles to:

case tagToEnum#
       (word2Int#
          (or#
             (int2Word# (dataToTag# (>=# y height)))
             (or#
                (int2Word# (dataToTag# (<# y 0)))
                (or#
                   (int2Word# (dataToTag# (>=# x width)))
                   (int2Word# (dataToTag# (<# x 0)))))))
of _ {
  False -> E2;
  True  -> E1
}

Primitive logical operators &&# and not# can be defined in a similar matter.

Solutions

It seems that the best solution to the problem would be implementing second of the above workarounds as a separate primop. Alternatively we can implement primitive bitwise or, and and xor that work on Ints instead of Words and define new logical operators using bitwise operators in one of the libraries.

Places of interest in the source code

The file prelude/primops.txt.pp defines PrimOps and their type signatures.

Attachments (1)

Download all attachments as: .zip