Narrow the scope of the notorious "state hack"
|Reported by:||simonpj||Owned by:|
|Type of failure:||Runtime performance bug||Test Case:|
|Related Tickets:||Differential Rev(s):|
The "state hack" has caused any number of bug reports (just search for that string), the most recent of which is #9349.
Here's an idea to make it less pervasive: (roughly) use the state hack only for top-level functions definitions.
The idea is that for nested lambdas the context should give the one-shot-ness, now that we are equipped with cardinality analysis. For example, consider the call
replicatM_ 1000 (\(s :: RealWorld#) -> blah)
The lambda is 1000-shot, not one-shot, notwithstanding the type of the binder. Moreover replicateM_'s strictness/cardinality signature will say just that, and GHC already knows how to propagate that information onto the \s.
But for top level functions like
pr :: String -> IO () pr x = putStrLn (reverse x)
we get Core
pr = \x. let y = reverse x in \ (s :: State# RealWorld). putStrLn y s
and, since we can't see all the callers of pr, we don't know if work is lost by pushing the reverse call inside, to get
pr = \x. (s :: State# RealWorld). putStrLn (reverse x) s
which is much more efficient. Indeed, this might not be so good if the calls looked like
... replicateM_ 1000 (pr "foo")...
because then "foo" will be reversed 1000 times. But arguably that's what the programmer expects anyway, looking at the code; and the efficiency hit from not eta-expanding all functions like pr (which are very very common) is significant.
The point is the that the only ones that need hacking are the top-level guys, and maybe even the top-level exported guys.
I have not fully thought this through, let alone tried it out, but I wanted to capture the thought. It would need some careful performance testing.