Opened 4 years ago

Last modified 20 months ago

#3755 new task

Improve join point inlining

Reported by: simonpj Owned by:
Priority: low Milestone: 7.6.2
Component: Compiler Version: 6.12.1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Difficulty:
Test Case: Blocked By:
Blocking: Related Tickets:

Description

This ticket relates to the paper "Optimising generics is easy" http://www.dreixel.net/research/pdf/ogie.pdf. Jose writes (about a case where inlining doesn't work well):

I put a minimal source code for this test and resulting Core
(with GHC 6.13.20091115) in https://subversion.cs.uu.nl/repos/staff.jpm.public/Inline/.
UpdateInt behaves great: in UpdateInt.core.O1.hs, there are no traces of
generic representations in testOne, testTwo, testThree and testFour.

UpdateString, however, does not behave so well. In UpdateString.core.O1.hs,
Test.testLogic_updateString still has loads of generic representations.

It's easy to see what is happening. Compile UpdateString (which I've attached to this ticket) with -ddump-simpl, and look at the core. You see stuff like

Rec { $j1_rx32 = \x. <big nested case expression>
    ; f = \y. ....($j1_rx32 <big constructor expression>)
              ---($j1_rx32 <big constructor expression)....
    }

So here the $j (which is a "join point") isn't inlined because it's big, although if it were inlined there would be much goodness because the case expressions would cancel with the explicit constructors.

Why did this happen despite lots of INLINE pragmas? I have not followed all the details, but I'm guessing that if we have, say

	{-# INLINE from #-}
	from = \x. case x of from_alts
	{-# INLINE to #-}
	to = \x. case x of to_alts

and we try to optimize this call:

from (mapT f (to x))

then after inlining from, mapT, and to we'll get

case (case (case x of to_alts) of map_alts) of from_alts

And now the case-of-case transform happens, which creates the join points to avoid duplicating map_alts, from_alts into every branch of to_alts. You may say that we've already said that it's ok to duplicate from (and hence from_alts) but we duplicated it once when we inlined it, and then we forget the origin of the code. And indeed, in the worse case you could get a quadratic blow up; and there are only two functions involved. So I'm not unhappy with that.

However, it does make me wonder whether we could not do a better job on the above Rec {..}. Two thoughts occur.

  1. We could beef up SpecConstr. It doesn't fire at the moment for some reason, even with -O2
  1. If we have
    f = \x. case x of { C1 x11..x1n -> e1; ... Ck xk1 ... xkm -> ek }
    
    maybe we should worker-wrapper it to
    f1 x1 .. x1n = e1
    ...
    fn xk1 .. xkm = en
    f = \x of pi -> fi xi
    
    and now inline f. The net effect is very similar to the way join points work right now, but it would make it multi-level. In fact, doing this might simplify and generalise the way that join points are currently done, where (rather arbitrarily) we duplicate the outer layer of a single case.

Simon

Attachments (1)

UpdateString.hs (6.9 KB) - added by simonpj 4 years ago.
UpdateString? example

Download all attachments as: .zip

Change History (9)

Changed 4 years ago by simonpj

UpdateString? example

comment:1 Changed 4 years ago by igloo

  • Milestone set to 6.14.1

comment:2 Changed 3 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:3 Changed 3 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:4 Changed 3 years ago by reinerp

I just ran into this when playing around with view patterns, with a significantly smaller minimial example:

{-# LANGUAGE ViewPatterns #-}
module Test1 where

data D1 = A | B | C
data D2 = A' | B' | C'

conv A = A'
conv B = B'
conv C = C'

foo :: (b -> D2) -> b -> a -> [a]
foo f (f -> A') a =  [a]
foo f (f -> B')	a =  reverse [a]
foo f (f -> C')	a = reverse (reverse [a]) -- put some junk in the RHS to make it too big to inline
{-# INLINE foo #-}

h :: D1 -> a -> [a]
h a b = foo conv a b

-- "manually inlined" version
j :: D1 -> a -> [a]
j (conv -> A') a =  [a]
j (conv -> B') a =  reverse [a]
j (conv -> C') a = reverse (reverse [a])

With GHC on -O2, we get this core:

Test1.h =
  \ (@ a_ad5) (a_abW :: Test1.D1) (b_abX :: a_ad5) ->
    let {
      $w$j_sfk :: [a_ad5]
      [LclId, Str=DmdType]
      $w$j_sfk =
        case a_abW of _ {
          Test1.A -> Test1.h1 @ a_ad5;
          Test1.B ->
            GHC.List.reverse1
              @ a_ad5
              (GHC.Types.: @ a_ad5 b_abX (GHC.Types.[] @ a_ad5))
              (GHC.Types.[] @ a_ad5);
          Test1.C ->
            GHC.List.reverse1
              @ a_ad5
              (GHC.List.reverse1
                 @ a_ad5
                 (GHC.Types.: @ a_ad5 b_abX (GHC.Types.[] @ a_ad5))
                 (GHC.Types.[] @ a_ad5))
              (GHC.Types.[] @ a_ad5)
        } } in
    case a_abW of _ {
      Test1.A -> GHC.Types.: @ a_ad5 b_abX (GHC.Types.[] @ a_ad5);
      Test1.B -> $w$j_sfk;
      Test1.C -> $w$j_sfk
    }

Interestingly, the "manually inlined" function j has the core we want:

Test1.j =
  \ (@ a_ad7) (ds_ddP :: Test1.D1) (a_abY :: a_ad7) ->
    case ds_ddP of _ {
      Test1.A -> GHC.Types.: @ a_ad7 a_abY (GHC.Types.[] @ a_ad7);
      Test1.B ->
        GHC.List.reverse1
          @ a_ad7
          (GHC.Types.: @ a_ad7 a_abY (GHC.Types.[] @ a_ad7))
          (GHC.Types.[] @ a_ad7);
      Test1.C ->
        GHC.List.reverse1
          @ a_ad7
          (GHC.List.reverse1
             @ a_ad7
             (GHC.Types.: @ a_ad7 a_abY (GHC.Types.[] @ a_ad7))
             (GHC.Types.[] @ a_ad7))
          (GHC.Types.[] @ a_ad7)
    }

Reiner

comment:5 Changed 3 years ago by simonpj

Ah yes, and see #3781 too. Thanks for another good example.

comment:6 Changed 3 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:7 Changed 2 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from normal to low

comment:8 Changed 20 months ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2
Note: See TracTickets for help on using tickets.