Opened 3 years ago

Last modified 2 months ago

#7109 infoneeded bug

Inlining depends on datatype size, even with INLINE pragmas

Reported by: dreixel Owned by: simonpj
Priority: normal Milestone: 7.12.1
Component: Compiler Version: 7.5
Keywords: Cc: nfrisby
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

Consider the following code:

data Logic = T | F
           | Not Logic

instance GEq Logic

testEqLogic = geq (Not T) (Not F)

With a proper definitions of generic equality geq in class GEq, and an instance Generic Logic, we get the following core code with -O1:

Rec {
Bug.$fGEqLogic_$cgeq [Occ=LoopBreaker]
  :: Bug.Logic -> Bug.Logic -> GHC.Types.Bool
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType SS]
Bug.$fGEqLogic_$cgeq =
  \ (x_ap1 :: Bug.Logic) (y_ap2 :: Bug.Logic) ->
    case x_ap1 of _ {
      Bug.T ->
        case y_ap2 of _ {
          Bug.T -> GHC.Types.True;
          Bug.F -> GHC.Types.False;
          Bug.Not g1_aBc_ayJ -> GHC.Types.False
        };
      Bug.F ->
        case y_ap2 of _ {
          Bug.T -> GHC.Types.False;
          Bug.F -> GHC.Types.True;
          Bug.Not g1_aBc_ayJ -> GHC.Types.False
        };
      Bug.Not g1_aBc_ayJ ->
        case y_ap2 of _ {
          __DEFAULT -> GHC.Types.False;
          Bug.Not g1_aBc1_XAu -> Bug.$fGEqLogic_$cgeq g1_aBc_ayJ g1_aBc1_XAu
        }
    }
end Rec }

Nice and simple, looking just like what we would expect for an equality function for datatype Logic.

Now we add one more constructor to datatype Logic (and adapt the Generic instance accordingly):

data Logic = T | F
           | Not Logic 
           | And Logic Logic

GHC (HEAD) now generates 3000 lines of core code for the Bug.$fGEqLogic_$cgeq function, instead of something only slightly longer than above.

Why is this? The second version of our Logic datatype is as easy to optimise as the first version; only the terms involved will be slightly longer. Attached file Bug2.hs is the input which gives the correct behaviour, while Bug3.hs is the input with one added constructor. (You might wonder if it has to do with the fact that the added constructor has more than one argument, but this is not the source of the problem.) Both files have INLINE pragmas pretty much everywhere (in fact, we're not deriving Generic so that we can put INLINE pragmas on to and from).

Attachments (2)

Bug2.hs (2.3 KB) - added by dreixel 3 years ago.
Bug3.hs (2.7 KB) - added by dreixel 3 years ago.

Download all attachments as: .zip

Change History (9)

Changed 3 years ago by dreixel

Changed 3 years ago by dreixel

comment:1 Changed 3 years ago by nfrisby

  • Cc nfrisby added

comment:2 Changed 2 years ago by igloo

  • difficulty set to Unknown
  • Milestone set to 7.8.1
  • Owner set to simonpj

Simon, is some sort of heuristic coming into play?

comment:3 Changed 10 months ago by thoughtpolice

  • Milestone changed from 7.8.3 to 7.10.1

Moving to 7.10.1.

comment:4 follow-up: Changed 3 months ago by thomie

  • Status changed from new to infoneeded

The function $fGEqLogic_$cgeq from the output of ghc -fforce-recomp -ddump-simpl -dsuppress-all -O1 Bug3.hs with ghc-7.9.20141121 (HEAD):

Rec {
$fGEqLogic_$cgeq
$fGEqLogic_$cgeq =
  \ x_aYx y_aYy ->
    let {
      $j_s1Y5
      $j_s1Y5 =
        \ a9_a1kB ->
          case y_aYy of _ {
            __DEFAULT -> False;
            Not g1_aBh_a1kM ->
              case a9_a1kB of _ {
                L1 a10_a1kz -> $fGEqLogic_$cgeq (a10_a1kz `cast` ...) g1_aBh_a1kM;
                R1 a10_X1o5 -> False
              };
            And g1_aBi_a1kN g2_aBj_a1kO ->
              case a9_a1kB of _ {
                L1 a10_a1kz -> False;
                R1 a10_X1o5 ->
                  case a10_X1o5 `cast` ... of _ { :*: a11_a171 b1_a172 ->
                  case $fGEqLogic_$cgeq (a11_a171 `cast` ...) g1_aBi_a1kN of _ {
                    False -> False;
                    True -> $fGEqLogic_$cgeq (b1_a172 `cast` ...) g2_aBj_a1kO
                  }
                  }
              }
          } } in
    case x_aYx of _ {
      T ->
        case y_aYy of _ {
          T -> True;
          F -> False;
          Not g1_aBh_a1kM -> False;
          And g1_aBi_a1kN g2_aBj_a1kO -> False
        };
      F ->
        case y_aYy of _ {
          __DEFAULT -> False;
          F -> True
        };
      Not g1_aBh_a1kM -> $j_s1Y5 (L1 (g1_aBh_a1kM `cast` ...));
      And g1_aBi_a1kN g2_aBj_a1kO ->
        $j_s1Y5
          (R1
             ((:*: (g1_aBi_a1kN `cast` ...) (g2_aBj_a1kO `cast` ...))
              `cast` ...))
    }
end Rec }

Pedro: is that sufficiently small, or do you still think there is a bug somewhere?

comment:5 in reply to: ↑ 4 ; follow-up: Changed 3 months ago by dreixel

Replying to thomie:

Pedro: is that sufficiently small, or do you still think there is a bug somewhere?

I think there's still a problem. Let's look at the code for Bug2.hs, which simply has one fewer constructor:

$fGEqLogic_$cgeq
$fGEqLogic_$cgeq =
  \ x_aYs y_aYt ->
    case x_aYs of _ {
      T ->
        case y_aYt of _ {
          T -> True;
          F -> False;
          Not g1_aBc_a1kH -> False
        };
      F ->
        case y_aYt of _ {
          __DEFAULT -> False;
          F -> True
        };
      Not g1_aBc_a1kH ->
        case y_aYt of _ {
          __DEFAULT -> False;
          Not g1_aBc1_X1mN -> $fGEqLogic_$cgeq g1_aBc_a1kH g1_aBc1_X1mN
        }
    }

This looks good. No casts, no L/R, just a perfectly specialised function.

In contrast, Bug3 (which simply adds one constructor) has generic representation leftovers and casts all over the place. It still feels like something is behaving significantly different just because the datatype got slightly bigger, and I don't know how to tell GHC that it shouldn't be afraid of inlining stuff just because the terms are bigger.

comment:6 in reply to: ↑ 5 Changed 3 months ago by simonpj

Replying to dreixel:

I don't know how to tell GHC that it shouldn't be afraid of inlining stuff just because the terms are bigger.

Well that's precisely what INLINE pragmas are for. But you know that, so you must mean something else.

Pedro, can you be more precise/specific about how to reproduce the problem you are concerned about (perhaps ghc -O Bug3.hs -ddump-simpl? I assume you are going to say "I put an INLINE pragma here, but it doesn't work" or something like that. Fine! The more specific you can be, the better chance I have to understand and help.

Simon

comment:7 Changed 2 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

Note: See TracTickets for help on using tickets.