Opened 7 years ago

Last modified 7 months ago

#3755 new task

Improve join point inlining

Reported by: simonpj Owned by:
Priority: low Milestone:
Component: Compiler Version: 6.12.1
Keywords: Inlining Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:


This ticket relates to the paper "Optimising generics is easy" Jose writes (about a case where inlining doesn't work well):

I put a minimal source code for this test and resulting Core (with GHC 6.13.20091115) in UpdateInt behaves great: in UpdateInt.core.O1.hs, there are no traces of generic representations in testOne, testTwo, testThree and testFour.

UpdateString, however, does not behave so well. In UpdateString.core.O1.hs, Test.testLogic_updateString still has loads of generic representations.

It's easy to see what is happening. Compile UpdateString (which I've attached to this ticket) with -ddump-simpl, and look at the core. You see stuff like

Rec { $j1_rx32 = \x. <big nested case expression>
    ; f = \y. ....($j1_rx32 <big constructor expression>)
              ---($j1_rx32 <big constructor expression)....

So here the $j (which is a "join point") isn't inlined because it's big, although if it were inlined there would be much goodness because the case expressions would cancel with the explicit constructors.

Why did this happen despite lots of INLINE pragmas? I have not followed all the details, but I'm guessing that if we have, say

	{-# INLINE from #-}
	from = \x. case x of from_alts
	{-# INLINE to #-}
	to = \x. case x of to_alts

and we try to optimize this call:

from (mapT f (to x))

then after inlining from, mapT, and to we'll get

case (case (case x of to_alts) of map_alts) of from_alts

And now the case-of-case transform happens, which creates the join points to avoid duplicating map_alts, from_alts into every branch of to_alts. You may say that we've already said that it's ok to duplicate from (and hence from_alts) but we duplicated it once when we inlined it, and then we forget the origin of the code. And indeed, in the worse case you could get a quadratic blow up; and there are only two functions involved. So I'm not unhappy with that.

However, it does make me wonder whether we could not do a better job on the above Rec {..}. Two thoughts occur.

  1. We could beef up SpecConstr. It doesn't fire at the moment for some reason, even with -O2
  1. If we have
    f = \x. case x of { C1 x11..x1n -> e1; ... Ck xk1 ... xkm -> ek }
    maybe we should worker-wrapper it to
    f1 x1 .. x1n = e1
    fn xk1 .. xkm = en
    f = \x of pi -> fi xi
    and now inline f. The net effect is very similar to the way join points work right now, but it would make it multi-level. In fact, doing this might simplify and generalise the way that join points are currently done, where (rather arbitrarily) we duplicate the outer layer of a single case.


Attachments (1)

UpdateString.hs (6.9 KB) - added by simonpj 7 years ago.
UpdateString example

Download all attachments as: .zip

Change History (14)

Changed 7 years ago by simonpj

Attachment: UpdateString.hs added

UpdateString example

comment:1 Changed 7 years ago by igloo

Milestone: 6.14.1

comment:2 Changed 6 years ago by igloo


comment:3 Changed 6 years ago by igloo


comment:4 Changed 6 years ago by reinerp

I just ran into this when playing around with view patterns, with a significantly smaller minimial example:

{-# LANGUAGE ViewPatterns #-}
module Test1 where

data D1 = A | B | C
data D2 = A' | B' | C'

conv A = A'
conv B = B'
conv C = C'

foo :: (b -> D2) -> b -> a -> [a]
foo f (f -> A') a =  [a]
foo f (f -> B')	a =  reverse [a]
foo f (f -> C')	a = reverse (reverse [a]) -- put some junk in the RHS to make it too big to inline
{-# INLINE foo #-}

h :: D1 -> a -> [a]
h a b = foo conv a b

-- "manually inlined" version
j :: D1 -> a -> [a]
j (conv -> A') a =  [a]
j (conv -> B') a =  reverse [a]
j (conv -> C') a = reverse (reverse [a])

With GHC on -O2, we get this core:

Test1.h =
  \ (@ a_ad5) (a_abW :: Test1.D1) (b_abX :: a_ad5) ->
    let {
      $w$j_sfk :: [a_ad5]
      [LclId, Str=DmdType]
      $w$j_sfk =
        case a_abW of _ {
          Test1.A -> Test1.h1 @ a_ad5;
          Test1.B ->
              @ a_ad5
              (GHC.Types.: @ a_ad5 b_abX (GHC.Types.[] @ a_ad5))
              (GHC.Types.[] @ a_ad5);
          Test1.C ->
              @ a_ad5
                 @ a_ad5
                 (GHC.Types.: @ a_ad5 b_abX (GHC.Types.[] @ a_ad5))
                 (GHC.Types.[] @ a_ad5))
              (GHC.Types.[] @ a_ad5)
        } } in
    case a_abW of _ {
      Test1.A -> GHC.Types.: @ a_ad5 b_abX (GHC.Types.[] @ a_ad5);
      Test1.B -> $w$j_sfk;
      Test1.C -> $w$j_sfk

Interestingly, the "manually inlined" function j has the core we want:

Test1.j =
  \ (@ a_ad7) (ds_ddP :: Test1.D1) (a_abY :: a_ad7) ->
    case ds_ddP of _ {
      Test1.A -> GHC.Types.: @ a_ad7 a_abY (GHC.Types.[] @ a_ad7);
      Test1.B ->
          @ a_ad7
          (GHC.Types.: @ a_ad7 a_abY (GHC.Types.[] @ a_ad7))
          (GHC.Types.[] @ a_ad7);
      Test1.C ->
          @ a_ad7
             @ a_ad7
             (GHC.Types.: @ a_ad7 a_abY (GHC.Types.[] @ a_ad7))
             (GHC.Types.[] @ a_ad7))
          (GHC.Types.[] @ a_ad7)


comment:5 Changed 6 years ago by simonpj

Ah yes, and see #3781 too. Thanks for another good example.

comment:6 Changed 5 years ago by igloo


comment:7 Changed 5 years ago by igloo

Priority: normallow

comment:8 Changed 4 years ago by igloo


comment:9 Changed 3 years ago by thoughtpolice


Moving to 7.10.1.

comment:10 Changed 2 years ago by thoughtpolice


Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:11 Changed 18 months ago by thoughtpolice


Milestone renamed

comment:12 Changed 13 months ago by thomie

Milestone: 8.0.1

comment:13 Changed 7 months ago by mpickering

Keywords: Inlining added
Note: See TracTickets for help on using tickets.