Changes between Version 2 and Version 3 of PrimBool


Ignore:
Timestamp:
Jan 25, 2013 10:02:14 AM (15 months ago)
Author:
jstolarek
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • PrimBool

    v2 v3  
    11= Implementing primitive Bool# = 
    22 
    3 This page gathers the notes about implementing `Bool` as a primitive data type and thus resolving ticket #6135. The motivation is to have primitive logical operators AND, OR and NOT that are strict in all arguments and compute their result by means of primitive bitwise operations. This would eliminate need for branching present in current implementation of logical operators: 
     3This page gathers the notes about implementing primitive logical operations and thus resolving ticket #6135. 
     4 
     5== The problem == 
     6 
     7Consider following fragment of code: 
    48 
    59{{{ 
    6 (||) x y 
    7  = case x of 
    8         True  -> True 
    9         False -> y 
     10 case (x <# 0#) || (x >=# width) || (y <# 0#) || (y >=# height) of 
     11  True  -> E1 
     12  False -> E2 
    1013}}} 
    1114 
    12 This would prevent code duplication caused by case-of-case transformation when multiple logical operations are chained together (see discussion on ticket #6135 for examples). 
     15This kind of code is common in image processing (and array programming in general) where one needs to check whether the `(x,y)` coordinates are within the image. Primitive comparison operators `<#` and `>=#` have type `Int# -> Int# -> Bool`. Logical OR operator `(||)` is defined as: 
     16 
     17{{{ 
     18(||)       :: Bool -> Bool -> Bool 
     19True  || _ =  True 
     20False || x =  x 
     21}}} 
     22 
     23in GHC.Classes (ghc-prim library) which is equivalent of: 
     24 
     25{{{ 
     26(||) x y = case x of 
     27            True  -> True 
     28            False -> y 
     29}}} 
     30 
     31During the compilation process (assuming the optimizations are turned on) the definition of `(||)` gets inlined and then case-of-case transform is performed succesively. This results in following Core (cleaned up for clarity): 
     32 
     33case <# x 0 of _ { 
     34  False -> 
     35    case >=# x width of _ { 
     36      False -> 
     37        case <# y 0 of _ { 
     38          False -> 
     39            case >=# y height of _ { 
     40              False -> E2 
     41              True  -> E1 
     42            }; 
     43          True -> E1 
     44        }; 
     45      True -> E1 
     46    }; 
     47  True -> E1 
     48}; 
     49 
     50and in following assembler code: 
     51 
     52{{{ 
     53.Lc1rf: 
     54        testq %r14,%r14 
     55        jl .Lc1rk 
     56        cmpq %rdi,%r14 
     57        jge .Lc1rp 
     58        testq %rsi,%rsi 
     59        jl .Lc1ru 
     60        cmpq %r8,%rsi 
     61        jge .Lc1rz 
     62        movl $Main_g2_closure+1,%ebx 
     63        jmp *0(%rbp) 
     64.Lc1rk: 
     65        movl $Main_g1_closure+1,%ebx 
     66        jmp *0(%rbp) 
     67.Lc1rp: 
     68        movl $Main_g1_closure+1,%ebx 
     69        jmp *0(%rbp) 
     70.Lc1ru: 
     71        movl $Main_g1_closure+1,%ebx 
     72        jmp *0(%rbp) 
     73.Lc1rz: 
     74        movl $Main_g1_closure+1,%ebx 
     75        jmp *0(%rbp) 
     76}}} 
     77 
     78There are five possible branches to take, although four of them have the same result. This is caused by code duplication introduced by case-of-case transform. Mis-predicted branches are bad in object code because they stall the pipeline. 
    1379 
    1480== Possible solutions and their consequences == 
    1581 
    16 Below are possible approaches to implement primitive `Bool#` in Haskell source language: 
     82Unnecessary branches could be eliminated by introducing primitive logical operators AND, OR and NOT (they will be denoted as `||#`, `&&#` and `not#` respectively). These operators would be strict and treat their logical parameters as unboxed integers (`1#` for `True` and `0#` for `False`). Result would be produced by using low-level bitwise operations. This means that if logical operators were chained together like in a given example only the final result would have to be inspected and thus only one branch would be necessary. 
    1783 
    1884=== First approach ===