Changes between Version 13 and Version 14 of PrimBool


Ignore:
Timestamp:
Apr 13, 2013 9:37:24 AM (12 months ago)
Author:
jstolarek
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • PrimBool

    v13 v14  
    11= Implementing primitive Bool# = 
    22 
    3 This page gathers the notes about implementing primitive logical operations and thus resolving ticket #6135. 
     3This page gathers the notes about implementing new primitive logical operations and thus resolving ticket #6135. 
    44 
    55== The problem == 
     
    7777}}} 
    7878 
    79 There are five possible branches to take, although four of them have the same result. This is caused by code duplication introduced by case-of-case transform (see [http://ics.p.lodz.pl/~stolarek/blog/2013/01/taking-magic-out-of-ghc-or-tracing-compilation-by-transformation/ this blog post] for a step by step derivation). According to Ben Lippmeier, who submitted the original bug report, mis-predicted branches are bad in object code because they stall the pipeline. 
    80  
    81 == Workarounds == 
    82  
    83 It is possible to work around the issue of code duplication by using GHC primops `tagToEnum#` and `dataToTag#`. These allow to distinguish between `True` and `False` by means of accessing the tag of a data type constructor. This means that `dataToTag#` can convert `True` to `1#` and `False` to `0#`, while `tagToEnum#` does the opposite (see paper [http://research.microsoft.com/en-us/um/people/simonpj/papers/ptr-tag/index.htm "Faster Laziness Using Dynamic Pointer Tagging"] for more details):  
    84  
    85 {{{ 
    86 ghci> import GHC.Exts 
    87 ghci> import GHC.Prim 
    88 ghci> :set -XMagicHash 
    89 ghci> I# (dataToTag# True) 
    90 1 
    91 ghci> I# (dataToTag# False) 
    92 0 
    93 ghci> (tagToEnum# 0#) :: Bool 
    94 False 
    95 ghci> (tagToEnum# 1#) :: Bool 
    96 True 
    97 ghci> 
    98 }}} 
    99  
    100 Having the possibility of converting `Bool` to an unboxed `Int#` allows us to compute results of logical expression by means of logical bitwise operations. The result can be converted back to a `Bool` so this is transparent on the Haskell source level, except for the fact that defined logical binary operators will be strict in both their arguments.  
    101  
    102 '''NOTE: Validity of this solution is based on assumption that `True` will always have a tag of `1#`, while `False` will have a tag of `0#`. Changing this invariant in the future would make these primitive logical operators invalid.''' 
    103  
    104 === First workaround === 
    105  
    106 First workaround assumes converting each result of comparison into an unboxed `Int` and replacing `||` with `+#`: 
    107  
    108 {{{ 
    109  case (dataToTag# (x <# 0#)) +# (dataToTag# (x >=# width)) +# 
    110       (dataToTag# (y <# 0#)) +# (dataToTag# (y >=# height)) of 
    111   0# -> E2 -- note that branch order is reversed 
    112   _  -> E1 
    113 }}} 
    114  
    115 This compiles to: 
    116  
    117 {{{ 
    118 case +# 
    119        (+# 
    120           (+# (dataToTag# (<# x 0)) (dataToTag# (>=# x width))) 
    121           (dataToTag# (<# y 0))) 
    122        (dataToTag# (>=# y height)) 
    123 of _ { 
    124   __DEFAULT -> E1; 
    125   0 -> E2 
    126 } 
    127 }}} 
    128  
    129 Similarly we can convert logical && into multiplication. 
    130  
    131 === Second workaround === 
    132  
    133 The above workaround is a bit clumsy: `dataToTag#`s make the code verbose and it may not be very obvious what the code is doing. Hence the second workaround, that defines an alternative logical `or` operator: 
    134  
    135 {{{ 
    136 (||#) :: Bool -> Bool -> Bool 
    137 (||#) x y = let xW = int2Word# (dataToTag# x) 
    138                 yW = int2Word# (dataToTag# y) 
    139                 zI = word2Int# (yW `or#` xW) 
    140             in tagToEnum# zI 
    141 }}} 
    142  
    143 This operator is defined in terms of primops `dataToTag#`, `tagToEnum#` and a bitwise or primop `or#`. Since the last one operates only on `Word`s we need to use `int2Word#` and `word2Int#` for conversion between these data types. Luckily, GHC does a good job of removing unnecessary conversions between data types. This means that: 
    144  
    145 {{{ 
    146  case (x <# 0#) ||# (x >=# width) ||# (y <# 0#) ||# (y >=# height) of 
    147   True  -> E1 
    148   False -> E2 
    149 }}} 
    150  
    151 compiles to: 
    152  
    153 {{{ 
    154 case tagToEnum# 
    155        (word2Int# 
    156           (or# 
    157              (int2Word# (dataToTag# (>=# y height))) 
    158              (or# 
    159                 (int2Word# (dataToTag# (<# y 0))) 
    160                 (or# 
    161                    (int2Word# (dataToTag# (>=# x width))) 
    162                    (int2Word# (dataToTag# (<# x 0))))))) 
    163 of _ { 
    164   False -> E2; 
    165   True  -> E1 
    166 } 
    167 }}} 
    168  
    169 Primitive logical operators `&&#` and `not#` can be defined in a similar matter. 
    170  
    171 '''NOTE: Neither of this two workarounds produces good object code. The reason is that comparison operators return a `Bool` as a thunk that needs to be evaluated. The real solution requires that no thunk is created.''' 
    172  
    173 == Solutions == 
    174  
    175 A good beginning would be to implementing second of the above workarounds as a primop. Then we need to create primops that return unboxed values instead of a thunk. The big question is should an unboxed version of Bool be introduced into the language? 
    176 === First approach === 
    177  
    178 Treat `Bool` as a boxed version of primitive `Bool#`. `True` would be equivalent of `B# True#`, `False` of `B# False#`: 
    179   {{{ 
    180 data Bool = B# True# | B# False# 
    181  
    182 -- B# :: Bool# -> Bool 
    183 }}} 
    184  
    185 Not sure if this can be considered equivalent to what the Haskell Report says about Bool. We need to ensure that `Bool#` is populated only by `True#` and `False#` and that these two are translated to `1#` and `0#` in the Core. It should be **impossible** to write such a function at Haskell level: 
    186  
    187 {{{ 
    188 g :: Bool -> Int -> Int 
    189 g (B# b) (I# i) = I# (b + i) 
    190 }}} 
    191  
    192 This approach might require one additional case expression to inspect the value of `Bool` at the Core level. For example: 
    193  
    194 {{{ 
    195 f :: Int -> Int -> Int 
    196 f x y = if x > y 
    197         then x 
    198         else y 
    199 }}} 
    200  
    201 would compile to: 
    202  
    203 {{{ 
    204 case x of _ { I# xP -> 
    205  case y of _ { I# yP -> 
    206    case ># xP yP of _ { 
    207      B# bP -> case bP of _ { 1# -> e1; 0# -> e2 } 
    208    } 
    209  } 
    210 } 
    211 }}} 
    212  
    213 This would complicate Core a bit but it should be possible to compile such Core to exactly the same result as with normal `Bool`. This code assumes that `>#` has type `Int# -> Int# -> Bool`, but to truly avoid branching in the Core we need `.># :: Int# -> Int# -> Bool#` so that we get a primitive value that doesn't need to be inspected using case expression but can be directly used by primitive logical operators. 
    214  
    215 === Second approach === 
    216  
    217 Second approach assumes creating type `Bool#` that is independent of type `Bool`. Boxing and unboxing would have to be done explicitly via additional functions: 
    218  
    219 {{{ 
    220 data Bool = True | False -- no changes here 
    221  
    222 bBox :: Bool# -> Bool 
    223 bBox 1# = True 
    224 bBox 0# = False 
    225  
    226 bUnbox :: Bool -> Bool# 
    227 bUnbox True  = 1# 
    228 bUnbox False = 0# 
    229 }}} 
    230  
    231 `Bool#` could not be implemented as an ADT because it is unlifted and unboxed, while ADT value constructors need to be boxed and lifted (see comments in [[GhcFile(compiler/types/TyCon.lhs)]]). There would need to be some magical way of ensuring that `Bool#` is populated only by `#0` and `1#` and that these values cannot be mixed with unboxed integers. Perhaps this could be done by preventing programmer from explicitly creating values of that type (can this be done?) and allow her only to use values returned from functions. 
    232  
    233 Another problem with this approach is that it would introduce primitive logical operations `||#` and `&&#` with type `Int# -> Int# -> Int#` - it is questionable whether anyone would want such operations available to the programmer. I think it is desirable to have primitive logical operators of type `Bool# -> Bool# -> Bool#`. 
    234  
    235 == Proposed patch (13/03/2013) == 
    236  
    237 The prototype patch posted on the trac implements six new prototype comparison primops: 
     79There are five possible branches to take, although four of them have the same result. This is caused by code duplication introduced by case-of-case transform (see [http://lambda.jstolarek.com/2013/01/taking-magic-out-of-ghc-or-tracing-compilation-by-transformation/ this blog post] for a step by step derivation). According to Ben Lippmeier, who submitted the original bug report, mis-predicted branches are bad in object code because they stall the pipeline. 
     80 
     81== Solution == 
     82 
     83The idea behind the solution is to modify comparison primops to return unboxed unlifted `Int#` instead of `Bool` (which is lifted and thus is returned as a thunk that needs to be evaluated). This will be implemented in the following way: 
     84 
     85  * existing comparison primops will have their return type changed to `Int#`. Also, their names will be changed. Operators will have `$` added before `#`, others will have `I` added before the `#` (this is a mnemonic denoting that this primop returns and `Int#`). Examples: 
     86 
     87{{{ 
     88>=$#      :: Int#    -> Int#    -> Int# 
     89/=$##     :: Double# -> Double# -> Int# 
     90gtCharI#  :: Char#   -> Char#   -> Int# 
     91eqWordI#  :: Word#   -> Word#   -> Int# 
     92ltFloatI# :: Float#  -> Float#  -> Int# 
     93leAddrI#  :: Addr#   -> Addr#   -> Int# 
     94}}} 
     95 
     96  * a new module `GHC.PrimWrappers` will be added to ghc-prim library. This module will contain wrappers for comparison primops. These wrappers will have names identical to removed primops and will return a `Bool`. Examples: 
     97 
     98{{{ 
     99gtChar# :: Char# -> Char# -> Bool 
     100gtChar# a b = tagToEnum# (a `gtCharI#` b) 
     101 
     102(>=#) :: Int# -> Int# -> Bool 
     103(>=#) a b = tagToEnum# (a >=$# b) 
     104 
     105eqWord# :: Word# -> Word# -> Bool 
     106eqWord# a b = tagToEnum# (a `eqWordI#` b) 
     107 
     108(/=##) :: Double# -> Double# -> Bool 
     109(/=##) a b = tagToEnum# (a /=$## b) 
     110 
     111ltFloat# :: Float# -> Float# -> Bool 
     112ltFloat# a b = tagToEnum# (a `ltFloatI#` b) 
     113 
     114leAddr# :: Addr# -> Addr# -> Bool 
     115leAddr# a b = tagToEnum# (a `leAddrI#` b) 
     116}}} 
     117 
     118Thanks to these wrappers the change will be almost backwards compatible. The only thing primop users will need to change in their existing code to make it work again is adding import of !GHC.PrimWrappers module. 
     119 
     120  * The following boot libraries require modification in order to work with the new primops: base, ghc-prim and integer-gmp. The only required modifications are imports of the !GHC.PrimWrappers module in modules that use the primops. 
     121 
     122== Proof of concept == 
     123 
     124The prototype patch posted on the trac on 13th of March implemented six new prototype comparison primops: 
    238125 
    239126{{{ 
     
    266153{{{ 
    267154# BB#0:                                 # %c1nK 
    268         movq    %rsi, %rax 
    269         orq     %r14, %rax 
    270         shrq    $63, %rax 
    271         cmpq    %rdi, %r14 
    272         setge   %cl 
    273         movzbl  %cl, %ecx 
    274         orq     %rax, %rcx 
    275         cmpq    %r8, %rsi 
    276         setge   %al 
    277         movzbl  %al, %eax 
    278         orq     %rcx, %rax 
    279         jne     .LBB2_1 
     155  movq  %rsi, %rax 
     156  orq %r14, %rax 
     157  shrq  $63, %rax 
     158  cmpq  %rdi, %r14 
     159  setge %cl 
     160  movzbl  %cl, %ecx 
     161  orq %rax, %rcx 
     162  cmpq  %r8, %rsi 
     163  setge %al 
     164  movzbl  %al, %eax 
     165  orq %rcx, %rax 
     166  jne .LBB2_1 
    280167# BB#3:                                 # %c1ol 
    281         movq    (%rbp), %rax 
    282         movl    $r1m6_closure+1, %ebx 
    283         jmpq    *%rax  # TAILCALL 
     168  movq  (%rbp), %rax 
     169  movl  $r1m6_closure+1, %ebx 
     170  jmpq  *%rax  # TAILCALL 
    284171.LBB2_1:                                # %c1nK 
    285         cmpq    $1, %rax 
    286         jne     .LBB2_2 
     172  cmpq  $1, %rax 
     173  jne .LBB2_2 
    287174# BB#4:                                 # %c1ov 
    288         movq    (%rbp), %rax 
    289         movl    $r1m7_closure+1, %ebx 
    290         jmpq    *%rax  # TAILCALL 
     175  movq  (%rbp), %rax 
     176  movl  $r1m7_closure+1, %ebx 
     177  jmpq  *%rax  # TAILCALL 
    291178.LBB2_2:                                # %c1ob 
    292         movq    r1m5_closure(%rip), %rax 
    293         movl    $r1m5_closure, %ebx 
    294         jmpq    *%rax  # TAILCALL 
     179  movq  r1m5_closure(%rip), %rax 
     180  movl  $r1m5_closure, %ebx 
     181  jmpq  *%rax  # TAILCALL 
    295182}}} 
    296183 
    297184The assembly does not contain comparisons and jumps in the scrutinee of the case expression, but still does jumps for selecting an appropriate branch of the case expression. 
    298  
    299 An alternative design decision is to make the new primops return a `Word#`. I decided to use `Int#` because `Int` is used more often than `Word`. If new primops returned a `Word#` the user would have to use `int2Word#`/`word2Int#` primops to do conversions if she ever wished to box the result. 
    300  
    301 Comparisons for `Word#`, `Float#` and `Double#` will be implemented once we make sure that the prototype implementation is correct. 
    302  
    303 Some concerns: 
    304   - should the primops return an `Int#` or `Word#` as their result? 
    305   - what names should the new primops have? I planned to use names with a dot preceeding the operator for `Int#` and `Double#` comparisons (e.g. `./=#`, `.>##`) and names with "St" suffix for `Word#` and `Float#`, e.g. `gtWordSt#`, `gtFloatSt#` (`St` stands for 'strict' because the result can be used with the strict bitwise logical operators). 
    306   - how to remove the old `Compare` primops (ones of type `T -> T -> Bool`)? 
    307   - once we have the new primops do we really care about the unboxed `Bool#`? 
    308185 
    309186=== Benchmarks for the proposed patch ===