wiki:NestedCPR

Version 14 (modified by nomeata, 17 months ago)

--

This is nomeata’s notepad about the nested CPR information:

Related tickets

  • #1600 Main ticket, where I report progress.

Tickets with stuff that would make nested CPR better:

  • #8598 CPR after IO (partly done)

Related testcases

TODOs

  • Does Nick Frisby’s late λ-lifting alleviate problems when CPR’ing join-points?
    • Need to see if his branch can be merged onto master.
  • Paper write-up of CPR
  • Shouldn’t nested CPR help a lot with Complex-heavy code? Is there something in nofib?
  • Try passing CPR information from the scrutinee to the pattern variables. For that: reverse the flow of the analysis for complex scrutinees (for simple ones, we want the demand coming from the body; for complex ones, this is not so important).
  • Use ticky-profiling to learn more about the effects of nested CPR.
  • Look at DmdAnal-related [SLPJ-Tickets] and see which ones are affected by nested-cpr.
  • Do not destroy join points (see below).
  • Can we make sure more stuff gets the Converging flag, e.g. after a case of an unboxed value? Should case binders get the Converging flag? What about pattern match variables in strict data constructors? Unboxed values?
  • Why does nested CPR make some stuff so bad?
    • Possibly because of character reboxing. Try avoiding CPR’ing C# altogether!
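
To make the Complex-heavy item above concrete, here is a hand-written sketch of what a nested CPR worker for complex multiplication might look like. It is not compiler output: the names mulPlain, wMulNested and mulNested are made up for illustration, and the real worker/wrapper split happens on Core, not on source. The point is only that the worker returns the two Double# components directly, so neither the :+ constructor nor the D# boxes are allocated inside it.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Data.Complex (Complex ((:+)))
import GHC.Exts (Double (D#), Double#, (*##), (+##), (-##))

-- Plain version: allocates a (:+) cell holding two boxed Doubles.
mulPlain :: Complex Double -> Complex Double -> Complex Double
mulPlain (a :+ b) (c :+ d) = (a * c - b * d) :+ (a * d + b * c)

-- Hypothetical nested-CPR worker: takes and returns raw Double# values.
-- Flat CPR would stop at (# Double, Double #) and keep the D# boxes.
wMulNested :: Double# -> Double# -> Double# -> Double#
           -> (# Double#, Double# #)
wMulNested a b c d =
  (# (a *## c) -## (b *## d), (a *## d) +## (b *## c) #)

-- Hypothetical wrapper: unboxes the arguments, reboxes the result.
-- After inlining at a call site that only inspects the components,
-- the reboxing can cancel against the caller's pattern match.
mulNested :: Complex Double -> Complex Double -> Complex Double
mulNested (D# a :+ D# b) (D# c :+ D# d) =
  case wMulNested a b c d of
    (# re, im #) -> D# re :+ D# im
```

Both versions perform the same floating-point operations in the same order, so they should agree exactly on any input.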

Join points

CPR can kill join points.
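
A source-level imitation of the problem, as I understand it (the real transformation happens on Core; the names original, split and wj are made up for this sketch). In the first function, j is only ever tail-called from the case branches, so it can be compiled as a join point. In the second, j has been split CPR-style into a worker wj returning an unboxed tuple, with the reboxing wrapper already inlined at each call site. Now wj is called in scrutinee position rather than tail position, so it is no longer a join point unless the enclosing function gets the same CPR treatment.

```haskell
{-# LANGUAGE UnboxedTuples #-}

-- Original shape: `j` is tail-called from every branch, so it can be
-- treated as a join point (a jump, with no closure allocated).
original :: Int -> Int -> (Int, Int)
original x y =
  let j z = (z + 1, z * 2)
  in case compare x y of
       LT -> j x
       EQ -> j (x + y)
       GT -> j y

-- Hand-written imitation of a CPR worker/wrapper split of `j` alone,
-- with the wrapper inlined. Each call to the worker `wj` now sits
-- under a case that reboxes the result, so it is not a tail call.
split :: Int -> Int -> (Int, Int)
split x y =
  let wj :: Int -> (# Int, Int #)
      wj z = (# z + 1, z * 2 #)
  in case compare x y of
       LT -> case wj x of (# a, b #) -> (a, b)
       EQ -> case wj (x + y) of (# a, b #) -> (a, b)
       GT -> case wj y of (# a, b #) -> (a, b)
```

The two functions compute the same results; only the calling structure differs.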

Common Context

Idea to fix this, and possibly more general benefits: http://www.haskell.org/pipermail/ghc-devs/2013-December/003481.html; prototype in branch wip/common-context.

  • On its own, improvements are present but very small: http://www.haskell.org/pipermail/ghc-devs/2013-December/003500.html
  • Enabling CPR for sum types in non-top-level bindings (which is currently disabled due to worries about lost join points) yields mixed results (min -3.8%, mean -0.0%, max 3.4%).
  • Enabling sum types inside nested CPR: Also yields mixed, not very promising results (-6.9% / +0.0% / +11.3%).

Direct detection

Alternative: Detect join points during dmdAnal and make sure that their CPR info is not greater than that of the expression they are a join point for. Would also fix #5075, see 5075#comment:19 for benchmark numbers.

  • On its own, no changes.
  • Enabling CPR for sum types: (min -3.8%, mean -0.0%, max 1.7%) (slightly better than with Common Context)
  • Enabling sum types inside nested CPR: TBD
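
A toy model of the capping step described under Direct detection. This is not GHC's actual representation: the Cpr type, common and capJoinPoint are all hypothetical, and the model ignores divergence and termination flags. It only illustrates the idea that a join point may keep exactly the CPR information that the surrounding expression also has.

```haskell
-- Toy CPR information: either nothing is known, or the result is
-- built with the constructor with the given tag, with nested
-- information for each field.
data Cpr = NoCpr | Con Int [Cpr]
  deriving (Eq, Show)

-- Keep only the information on which both arguments agree.
common :: Cpr -> Cpr -> Cpr
common (Con t1 fs1) (Con t2 fs2)
  | t1 == t2, length fs1 == length fs2 = Con t1 (zipWith common fs1 fs2)
common _ _ = NoCpr

-- Cap the CPR info of a join point at that of the expression it is a
-- join point for, so the join point never claims more than its context.
capJoinPoint :: Cpr -> Cpr -> Cpr
capJoinPoint context joinPoint = common context joinPoint
```

For example, if the context only has flat information Con 1 [NoCpr, NoCpr] and the join point has the nested Con 1 [Con 1 [], NoCpr], the capped result is the flat one.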

Side tracks

  • Should runSTRep be inlined (see ticket:1600#comment:34)?
  • Can we use Terminates CPR information to eagerly evaluate thunks? Yes, and there is a small gain there: #8655
    • But why no allocation change? Understand this better!
    • Can we statically and/or dynamically count the number of thunks, and the number of CBV’ed thunks?
  • Why is cacheprof not deterministic? (→ #8611)
  • What became of Simon’s better-ho-cardinality branch? See better-ho-cardinality.
  • Try VTune to get better numbers.
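
A source-level illustration of the eager-evaluation item above (the function names are made up, and the bang pattern only stands in for what the compiler would do on Core). In lazyBox, s is not demanded on every path, so it is built as a thunk. If the analysis knew that the right-hand side is certain to terminate, it could evaluate it up front as in eagerBox and skip the thunk. The two only differ observably when the right-hand side diverges, which is exactly what the Terminates information would rule out.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Lazy binding: `s` is stored inside `Just` unevaluated, i.e. as a thunk.
lazyBox :: Int -> Int -> Maybe Int
lazyBox a b =
  let s = a + b
  in if a > 0 then Just s else Nothing

-- Call-by-value variant: `s` is evaluated before the branch.
-- This is only safe to substitute if `a + b` is known to terminate.
eagerBox :: Int -> Int -> Maybe Int
eagerBox a b =
  let !s = a + b
  in if a > 0 then Just s else Nothing
```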