Do more inlining into non-recursive join points
In pursuing #14068, Joachim says found this code:
let {
ds5 :: [[Int]]
ds5 = case ==# x1 ww1 of {
__DEFAULT -> go1 (+# x1 1#);
1# -> n
} } in
join {
lvl6 :: [[Int]]
lvl6 = (ds4 : y) : ds5} in
joinrec {
safe :: Int -> Int -> [Int] -> [[Int]]
safe (x2 :: Int) (d :: Int) (ds6 :: [Int])
= case ds6 of {
[] -> jump lvl6;
: q l ->
case x2 of wild6 { I# x3 ->
case q of { I# y1 ->
case /=# x3 y1 of {
__DEFAULT -> ds5;
1# ->
case d of { I# y2 ->
case /=# x3 (+# y1 y2) of {
__DEFAULT -> ds5;
1# ->
case /=# x3 (-# y1 y2) of {
__DEFAULT -> ds5;
1# ->
jump safe
wild6 (I# (+# y2 1#)) l
}}}}}}
}; } in
jump safe ds4 lvl y; } in ...
This is terrible! lvl6
is a join point, but ds5
is not. And it's easy to fix: we should simply float ds5
into lvl6
's rhs.
- *Backgound**. In general, if GHC sees this
-- Version A
x = f x : g y
it'll float out the (f x)
and (g y)
parts thus (B):
-- Version B
a1 = f x
a2 = g y
x = a1 : a2
Reasons:
- In the scope of this binding we might have
case x of (a:b) -> rhs
, and the new form allows us to eliminate the case. - Version A builds a thunk (which, when eval'd) builds thunks for
(f x)
and(g y)
and returns a cons; whereas Version B builds the two thunks ahead of time and allocates a cons directly. Ifx
gets evaluated this is a win, by avoiding an extra thunk. We measured this in our let-floating paper, and B is much better on average.
But if x
is a join point, all this is wrong wrong wrong:
- A join point is never scrutinised by a nested case, so there is no benefit in floating.
- A join point is not implemented by a thunk, so there is no thunk alloc/update to avoid.
In fact, for join points we want to turn Version B into Version A!
Specifically:
-
In
Simplify
, do not callprepareRhs
on join RHSs; nor do floating (seetick LetFloatFromLet
) from the RHS of a join. In fact this seems to be already the case. -
In
OccurAnal
, do not userhsCtxt
for the RHS of a non-recursive join point (seeoccAnalNonRecRhs
). Setting this context makesa1
anda2
get "inside-lam" occurrences (seeoccAnalApp
), and that in turn stopsa1
anda2
getting inlined straight back intox
's RHS in Version B. But for join points we **want** them to be inlined, to get back to Version A.For a recursive join point we probably do not want to inline
a1
anda2
because then they'd be allocated each time round the loop. But in fact we can't have a recursive join point whose RHS is just a cons, so it doesn't really arise. The point is: we only need to take care withrhsCtxt
for non-recursive join points.
Bottom line: just that one fix to OccurAnal
should do the job.
Trac metadata
Trac field | Value |
---|---|
Version | 8.2.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |