This would prevent code duplication caused by the case-of-case transformation when multiple logical operations are chained together (see the discussion on ticket #6135 for examples).
This kind of code is common in image processing (and array programming in general), where one needs to check whether the `(x,y)` coordinates lie within the image. The primitive comparison operators `<#` and `>=#` have type `Int# -> Int# -> Bool`. The logical OR operator `(||)` is defined as:

{{{
(||) :: Bool -> Bool -> Bool
True  || _ = True
False || x = x
}}}

in `GHC.Classes` (ghc-prim library), which is equivalent to:

{{{
(||) x y = case x of
    True  -> True
    False -> y
}}}
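
What happens when this case form is inlined at a use site can be sketched by hand. The names `beforeT`, `afterT`, `e1` and `e2` below are hypothetical and chosen only for illustration; this is a source-level approximation of the transformation, not output produced by GHC.

```haskell
-- Hand-written sketch of case-of-case applied to a use of (||).
-- All names here are illustrative assumptions, not GHC internals.

-- Before: the outer case scrutinises the inlined body of (a || b).
beforeT :: Bool -> Bool -> r -> r -> r
beforeT a b e1 e2 =
  case (case a of { True -> True; False -> b }) of
    True  -> e1
    False -> e2

-- After: the outer case has been pushed into both branches of the inner
-- one, and the branch that scrutinised the known constructor True has
-- been simplified away.
afterT :: Bool -> Bool -> r -> r -> r
afterT a b e1 e2 =
  case a of
    True  -> e1
    False ->
      case b of
        True  -> e1   -- e1 now occurs twice
        False -> e2
```

Both functions compute the same result; the difference is that `e1` is duplicated once for every `(||)` in the chain.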

During compilation (assuming optimizations are turned on), the definition of `(||)` gets inlined and the case-of-case transformation is then applied repeatedly. This results in the following Core (cleaned up for clarity):

{{{
case <# x 0 of _ {
  False ->
    case >=# x width of _ {
      False ->
        case <# y 0 of _ {
          False ->
            case >=# y height of _ {
              False -> E2
              True  -> E1
            };
          True -> E1
        };
      True -> E1
    };
  True -> E1
};
}}}

and in the following assembly code:

{{{
.Lc1rf:
    testq %r14,%r14
    jl .Lc1rk
    cmpq %rdi,%r14
    jge .Lc1rp
    testq %rsi,%rsi
    jl .Lc1ru
    cmpq %r8,%rsi
    jge .Lc1rz
    movl $Main_g2_closure+1,%ebx
    jmp *0(%rbp)
.Lc1rk:
    movl $Main_g1_closure+1,%ebx
    jmp *0(%rbp)
.Lc1rp:
    movl $Main_g1_closure+1,%ebx
    jmp *0(%rbp)
.Lc1ru:
    movl $Main_g1_closure+1,%ebx
    jmp *0(%rbp)
.Lc1rz:
    movl $Main_g1_closure+1,%ebx
    jmp *0(%rbp)
}}}

There are five possible paths through this code, although four of them produce the same result (`Main_g1_closure`). This is caused by the code duplication introduced by the case-of-case transformation. Each of the four conditional jumps can be mispredicted, and mispredicted branches are costly in object code because they stall the pipeline.
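
For comparison, here is a minimal sketch of the same bounds check written two ways. It assumes a GHC version (7.8 or later) in which `<#` and `>=#` return `Int#` rather than `Bool` as described above, and in which `isTrue#` and `orI#` are exported from `GHC.Exts`. The function names `outsideBranchy` and `outsideBitwise` are hypothetical.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), isTrue#, orI#, (<#), (>=#))

-- Hypothetical helper: True when (x, y) lies outside a width-by-height image.
-- Every comparison is converted to Bool and combined with (||), so after
-- inlining the optimiser may produce a chain of conditional jumps.
outsideBranchy :: Int -> Int -> Int -> Int -> Bool
outsideBranchy (I# x) (I# y) (I# width) (I# height) =
  isTrue# (x <# 0#) || isTrue# (x >=# width)
    || isTrue# (y <# 0#) || isTrue# (y >=# height)

-- Same predicate, but the four Int# results (1# or 0#) are first combined
-- with bitwise OR, leaving a single conversion to Bool at the end.
outsideBitwise :: Int -> Int -> Int -> Int -> Bool
outsideBitwise (I# x) (I# y) (I# width) (I# height) =
  isTrue# ((x <# 0#) `orI#` (x >=# width) `orI#` (y <# 0#) `orI#` (y >=# height))
```

The two functions agree on every input. The second always evaluates all four comparisons, which is the price of giving the code generator a chance to emit one conditional branch instead of four; whether it does so depends on the GHC version and backend.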