Opened 8 years ago

Last modified 8 months ago

#855 new task

Improvements to SpecConstr

Reported by: simonpj Owned by:
Priority: normal Milestone:
Component: Compiler Version: 6.4.2
Keywords: Cc: hackage.haskell.org@…, ekmett@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: N/A Blocked By:
Blocking: Related Tickets:

Description

There are a series of possible improvemnts to SpecConstr, described in
the source code. compiler/specialise/SpecConstr

  • Specialising for constant parameters
  • Specialising for lambda parameters
  • Two ideas to do with strictness that look more tricky

Some of them look quite straightforward.

Change History (7)

comment:1 Changed 7 years ago by igloo

  • Milestone set to 6.8
  • Test Case set to N/A

comment:2 Changed 6 years ago by simonpj

  • Milestone changed from 6.8 branch to _|_

I'll look at this again when someone finds that SpecConstr really isn't doing the Right Thing for them. My guess is that this will happen either in array fusion for nested data parallelism, or in the use of stream fusion for lists (Coutts et al ICFP'07).

Meanwhile, here is the last message I had from Roman on the subject, which summarises what he did to make SpecConstr work reasonably well for their ICFP paper.

Specialising on constants

Consider the following program (admittedly contrived, but we get this kind of code with stream fusion):

lvl = Just True

foo :: Maybe Bool -> Int -> Int
foo (Just True) i = i
foo _           i = foo lvl i

SpecConstr doesn't optimise this even though it is supposed to specialise on global variables. If I understand correctly, the reason for this is the following test in argToPat:

   -- Check if the argument is a variable that
   -- is in scope at the function definition site
   -- It's worth specialising on this if
   --   (a) it's used in an interesting way in the body
   --   (b) we know what its value is
argToPat in_scope con_env (Var v) arg_occ
   | not (isLocalId v) || v `elemInScopeSet` in_scope,
     case arg_occ of { UnkOcc -> False; other -> True },        -- (a)
     isValueUnfolding (idUnfolding v)                   -- (b)
   = return (True, Var v)

The problem with this is that v might have been cloned (via extendBndr*) and in that case, it won't have an unfolding. That is precisely what happens in the above example, I believe.

I solved this by explicitly recording which binders have a known value
but you might have a better idea.

The threshold

I don't like the spec-threshold idea at all. In fact, it was the first thing I had to disable for the ICFP paper since it just doesn't make sense for the kind of code stream fusion generates. It can produce some very large loops which *must* be specialised; indeed, SpecConstr will usually decrease the program size.

Specialising on lambdas

Specialising on lambdas is absolutely crucial for stream fusion (for nested lists). I completely agree that the direct approach was too fragile. What I did was a rather bad hack but it worked.

  • I extended FloatOut (or, rather, SetLevels) such that it would float out lambdas in recursive arguments, i.e., given
           foo x = ... foo (C (\x. e)) ...
    
    it would produce
           f a1 ... an x = e
           foo x = ... foo (C (f a1 ... an)) ...
    
    where a1,...,an are the free variables of e.
  • This floating would happen before every SpecConstr pass (see below).
  • I extended SpecConstr (very hackishly) to specialise on partial applications of global variables. For the above, it would produce
           foo' a1 ... an = ...
    
    and the rewrite rule
           forall a1 ... an. foo (C (f a1 ... an)) = foo' a1 ... an
    
    Now, f is used directly in foo'. We have a bad staging problem here, however: we definitely want to inline f in foo' (i.e., it should get an INLINE pragma) but not in foo (at least, not before the rule has fired). I solved this by a terrible hack involving core notes but clearly, something better is needed here. Note that we might generate several specialisations which use the same f and we want to inline the latter in every specialisation.
  • Finally, inlining f in foo' and simplifying might expose further opportunities for SpecConstr. For the ICFP paper, I implemented FloatOut/SpecConstr/Simplify loop. Stream fusion with nested lists requires as many iterations as there are nesting levels (each iteration would peel away one nesting level). Note that the current specLoop doesn't help here because we really have to do full simplification (including inlining).

That was basically all I did. We didn't really use the static argument
transformation because specialising on global variables essentially has
the same effect.

comment:3 Changed 6 years ago by simonmar

  • Architecture changed from Unknown to Unknown/Multiple

comment:4 Changed 6 years ago by simonmar

  • Operating System changed from Unknown to Unknown/Multiple

comment:5 Changed 15 months ago by morabbin

  • Type of failure set to None/Unknown

Do we have a place for the kind of knowledge buried in simonpj's comment above? Or a ticket type? (My apologies if I could have found this info elsewhere.)

comment:6 Changed 13 months ago by liyang

  • Cc hackage.haskell.org@… added

comment:7 Changed 8 months ago by ekmett

  • Cc ekmett@… added
Note: See TracTickets for help on using tickets.