Opened 3 years ago

Last modified 8 months ago

#5059 new feature request

Pragma to SPECIALISE on value arguments

Reported by: batterseapower Owned by:
Priority: low Milestone: 7.6.2
Component: Compiler Version: 7.0.3
Keywords: Cc: johan.tibell@…, ekmett@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

I've sometimes found myself wishing for this pragma to get some "partial evaluation on the cheap". The idea is to allow something like:

defaultOpts :: Options
defaultOpts = ...

{-# SPECIALISE f defaultOpts :: Int -> Int #-}
f :: Options -> Int -> Int
f opts x = ... f opts ...

This would desugar into this additional code:

{-# RULES "f/spec" f defaultOpts = f_spec_1 #-}
f_spec_1 = (\opts x -> ... ... f opts ...) defaultOpts -- NB: body of f duplicated

The hope is that the simplifier and RULE matcher will tidy this up so we get a nice loop back to f_spec_1 with the body of the function specialised for the particular opts.

This is useful when functions are called often with particular arguments. An example would be where "f" is an edit-distance function which takes costs to be assigned to each edit, strings to be compared and returns an integer distance. In my library, the costs are given almost always going to be the default ones so I want to make that case fast, but I want to allow the user to supply their own set.

This pragma is somewhat subsumed by:

  1. SpecConstr, if the options are algebraic data/literals that are also scrutinised by the body of f
  1. Static argument transformation, except that the RULE based strategy achieves more code sharing compared to SAT

I think that pragma might be a relatively simple to implement nice-to-have feature.

Change History (15)

comment:1 Changed 3 years ago by rl

This would be quite useful but getting this to work properly seems tricky. You really want defaultOpts to be inlined into the body of the specialisation - that's the whole point, after all. On the other hand, you really don't want it to be inlined into the (f defaultOpts) call because then the rule won't match. I think you have to give defaultOpts an INLINE[0] or INLINE[1] pragma to make sure that the specialised function is optimised but the rule still has a chance to fire. That's not nice, though, because you have to remember to do that and it isn't very clear what happens if you don't.

comment:2 Changed 3 years ago by batterseapower

Yes, I elided that detail in my example. Having to be careful with your INLINE pragmas is a general problem related to the RULES mechanism that crops up a lot so I see it as orthogonal to this ticket.

comment:3 Changed 3 years ago by simonpj

I wonder if you could just write

{-# RULES "f/spec" f defaultOpts = f_spec_1 #-}
f_spec_1 = inline f defaultOpts -- NB: body of f duplicated

The inline pseudo-function makes f inline just once, which is what you want. (I'm not certain that it works for recursive functions, but it probably should.)

So this isn't quite a compact as a language extension, but every language extension carries its own costs (eg explaining the interaction with INLINE pragmas, and this way means we don't need to make any changes.

Simon

comment:4 Changed 3 years ago by batterseapower

Good thought but it dosen't work. Here is what happens instead:

  1. The call to inline remains in place up to including phase 1, presumably because GHC can't inline recursive functions into such positions
  1. In phase 0, the call to "inline" is removed and the RHS of f_spec is replaced with (f defaultOpts)
  1. The RULE files and rewrites f_spec = f_spec, building a loop. Ooops!

So at the moment doing this is worse than useless, it actually makes your program less defined :-)

comment:5 Changed 3 years ago by simonpj

Right. Like I said inline would need fixing to be able to inline even recursive functions. But then it might work just fine, right?

comment:6 Changed 3 years ago by batterseapower

Yes, if "inline" reliably inlined 1 copy of even recursive functions then I agree your idea would be a fine replacement for an additional GHC feature.

So we should keep this ticket open until "inline" has the right behaviour.

comment:7 Changed 3 years ago by igloo

  • Milestone set to 7.4.1

comment:8 Changed 3 years ago by simonpj

Here's another example where I'd like to write a specialise-on-value pragma. Consider equality on lists:

eqList [] [] = True
eqList (x:xs) [] = False
eqList [] (y:ys) = False
eqList (x:xs) (y:ys) = x==y && eqList xs ys

I'd like to specialise on the empty list

{-# SPECIALISE eqList [] xs #-}
{#- SPECIALISE eqList xs [] #-}

because in each case the result boils down to null, and that's great at the call sites. This shows up in string_of in nofib spectral/rewrite.

comment:9 Changed 3 years ago by ezyang

It's interesting to note that you can get the desired effect with current inlining technology if you write your recursive function in particular way, transforming:

f opts x = ... f opts ...

into

f opts = worker
  where worker x = ... worker ...

But it's not at all obvious that you can always do this transformation, especially if recursive call to the function uses a different value than the one you are specializing with. So you in fact do have to make a judgment call about how much you want to unroll the definition (I think Max has suggested we only unroll once); a pretty classic partial evaluation problem, if you ask me. :-)

comment:10 Changed 3 years ago by batterseapower

Edward: that is known as the static argument transformation - see #888

In my tests I found that (with some heuristics about when to do it) it sped up nofib by 20-30%! I did commit my code at the time but didn't do the rest of the work to fix all the cases where SAT was making programs worse. In particular I remember it was causing ++ to be inlined and specialised a lot more because the second argument is static. We would like to avoid this.

This might be a nice project if you are interested in this area? I think the perf win will be big.

comment:11 Changed 3 years ago by ezyang

Ah yes, I do remember you mentioning this on the stream fusion ticket #915. And I am looking for PhD projects...

comment:12 Changed 2 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from normal to low

comment:13 Changed 2 years ago by simonpj

  • Cc johan.tibell@… added
  • Difficulty set to Unknown

Johan Tibell writes: While looking at the GCC 4.7 release notes I saw something that's perhaps worth stealing. Taken from the release notes:

    The inter-procedural constant propagation pass has been rewritten. It
    now performs generic function specialization. For example when
    compiling the following:

    void foo(bool flag)
    {
      if (flag)
        ... do something ...
      else
        ... do something else ...
    }
    void bar (void)
    {
      foo (false);
      foo (true);
      foo (false);
      foo (true);
      foo (false);
      foo (true);
    }


    GCC will now produce two copies of foo. One with flag being true,
    while other with flag being false. This leads to performance
    improvements previously possibly only by inlining all calls. Cloning
    causes a lot less code size growth.

I found myself wanting something like this just today. A common pattern I see in code is this:

    data Options = Options { ... }

    defaultOptions :: Options
    defaultOptions = ...

    foo :: ...
    foo = fooWith defaultOptions

    fooWith :: Options -> ...
    fooWith = ...

It'd be nice if we could get foo to specialize on its input arguments, without having to mark all of fooWith as INLINE.

-- Johan

comment:14 Changed 20 months ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:15 Changed 8 months ago by ekmett

  • Cc ekmett@… added
Note: See TracTickets for help on using tickets.