Version 3 (modified by 4 years ago) (diff) | ,
---|

## wave4main

Baseline: [0e2fd3/ghc], Tested: nested-cpr (without nesting inside sum-types, without join-point detection).

Found a 11% increase in allocation, around `9000000`

bytes.

The most obvious change in ticky-ticky-number are:

`FUNCTION ENTRIES`

and`ENTERS`

increasing by ~100000`RETURNS`

doubling from 140745 to 280795-
`ALLOC_FUN_ctr`

and`ALLOC_FUN_gds`

almost doubling, by ~18000 resp. 9000000

So we are allocating more function closures. First guess: Join point property destroyed somewhere.

The ticky output shows a `$wgo{v s60k} (main:Main)`

appearing that was not there before, with `140016`

enters and `23522688`

allocations. This appears in `$wtabulate`

, and indeed corresponds to a `go1`

that is a join-point before. So what is happening? We are changing

go1 [Occ=LoopBreaker] :: GHC.Prim.Int# -> GHC.Prim.State# s -> (# GHC.Prim.State# s, GHC.Arr.Array GHC.Types.Int x #)

to

$wgo [Occ=LoopBreaker] :: GHC.Prim.Int# -> GHC.Prim.State# s -> (# GHC.Prim.State# s, GHC.Prim.Int#, GHC.Prim.Int#, GHC.Prim.Int#, GHC.Prim.Array# x #)

`go1`

is recursive, but tail-recursive, so the worker and wrapper indeed cancel for the recursive call. But where it is being used, we simply apply the `Array`

constructor to the second component. So nothing is gained, but a join-point is lost.

My attempt below to detect join points does not help: The CPR information for `go1`

is the same as for `let go1 = rhs in body`

, as `body`

is just `go1 ww ipv`

.

The problem is that this is being passed as an argument to a function. In this case `runSTRep`

(which is the function its being passed to) might help, but what is the general solution?

In this case, inlining `runSTRep`

helps a bit, as it decreases allocation to `+6.6%`

. (Whether that is correct is quite dubious). But in order to propagate the CPR’ing to tabulate, it seems one needs to pass nested CPR information from a case scrunitee to its components.

So I implement this, and now `tabulate`

itself gets a CPR property, which comes from the second component of the `ST`

action, as extracted by `runSTRep`

. But we still do not retain the join-point-property, as the code that throws away the `ST`

token is still there:

case $wgo ww ipv of _ [Occ=Dead] { (# ww3, ww4, ww5, ww6, ww7 #) -> (# ww4, ww5, ww6, ww7 #)

At that point I am running out of ideas.

## Summary with simple expressions

Original code:

f a x = case a of True -> case foo of b -> snd $ let go 0 = (1,(2,3)) go n = go (n-1) in go b False -> undefined

Initial CPR transformation yields, where `$go`

is no longer a join-point for the argument to `snd`

.

f a x = case a of True -> case foo of b -> snd $ let $wgo 0 = (# 1, 2, 3 #) $wgo n = go (n-1) in case $wgo b of (# a, b, c #) -> (a, (b,c)) False -> undefined

Inlining `snd`

does not help a lot:

f a x = case a of True -> case foo of b -> let go 0 = (1,(2,3)) go n = go (n-1) in case (go b) of (x,y) -> y False -> undefined

still turns into

f a x = case a of True -> case foo of b -> snd $ let $wgo 0 = (# 1, 2, 3 #) $wgo n = go (n-1) in case $wgo b of (# a, b, c #) -> (b,c) False -> undefined

Furthermore allowing CPR information to flow from scrunitee to bound variables gives `f`

a CPR property (which is good), but does not solve the problem.

$wf a x = case a of True -> case foo of b -> snd $ let $go 0 = (# 1, 2, 3 #) $go n = go (n-1) in case $go b of (# a, b, c #) -> (# b, c #) False -> undefined