Opened 2 years ago

Last modified 2 years ago

#13207 new bug

Performance regressions from removing LNE analysis

Reported by: lukemaurer Owned by: lukemaurer
Priority: normal Milestone:
Component: Compiler Version: 8.1
Keywords: JoinPoints Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: 12988 Differential Rev(s):
Wiki Page:

Description (last modified by lukemaurer)

CoreToStg's LNE analysis has been superseded by the occurrence analyzer's join-point analysis. However, there are a few cases where the occurrence analyzer doesn't find a join point where the old analysis would've found an LNE. Known causes include:

  1. The LNE analysis was untyped, which (very) occasionally allows an LNE where a join point would've fallen afoul of the polymorphism rule (see Note [The polymorphism rule of join points] in CoreSyn.hs).
  2. Sometimes a rewrite rule prevents a binding from being christened a join point; such a binding would've been found by the LNE analysis (which runs after Core Tidy has removed rules from local binders).

The biggest loss in NoFib is spectral/cryptarithm2, which saw a 0.3% increase in allocations because a rule prevented detection of a join point. Specifically, there's a go that gets SpecConstr'd as an $sgo. All the calls to $sgo happen to be tail calls, but some calls to go aren't, so it's only the rule that keeps $sgo from becoming a join point.

In summary (adapted from the cryptarithm2 situation):

letrec $sgo = \x y z -> ... go {- non-tail call -} ...   
       {-# RULE "SPEC go" forall x y z. go ((x,y):z) = $sgo x y z #-}
       go = \xs -> ... go ...
in ... go ... $sgo a b c {- tail call -} ...

The only thing keeping $sgo from becoming a join point is the rule. Possible solutions:

  1. Run the occurrence analyzer and the simplifier (or simpleOptPgm) after Tidy Core. Seems like the solution needing the least code. Problems:
    • Tidy Core turns top-level binders into GlobalIds. Would need to stop doing that there or adapt the occurrence analyzer and the simplifier (or simpleOptPgm).
    • Only covers case 2.
  1. Write a new pass that just throws out all the rules from the local binders in the module, then run it (and then the simplifier) at the end of the Core-to-Core pipeline, just before Core Tidy. (Local rules don't appear in ifaces anyway.)
    • Needs a whole new traversal, albeit a very simple one. Though it's possible that dropping the rules would lead to other optimizations ($sgo would end up pre-inlined in the above example), so it might be helpful anyway.
    • Also only covers case 2.
  1. Write the LNE analysis as an STG-to-STG pass. Only idea that covers both cases.
    • Needs even more code, and duplicative of the occurrence analyzer.
  1. Loosen the join-point invariants somehow. For example, if we could flag a rule as applying only to tail calls, that rule could contain a jump in its RHS even though its LHS isn't a jump.
    • But this is already problematic in the above example—we also can't make $sgo a join point so long as it's mutually recursive with go, which it still is so long as the rule is there. In other words, even if the rule doesn't directly keep $sgo from being a join point, it's still the only thing keeping it from breaking free of go's recursive group, which it must do to become a join point.
    • So it doesn't cover case 2 very well and it doesn't cover case 1 at all.

Change History (3)

comment:1 Changed 2 years ago by lukemaurer

Version: 8.0.18.1

comment:2 Changed 2 years ago by lukemaurer

Description: modified (diff)

comment:3 Changed 2 years ago by lukemaurer

Description: modified (diff)
Note: See TracTickets for help on using tickets.