Changes between Version 40 and Version 41 of NestedCPR


Timestamp:
Jan 15, 2014 3:11:14 PM (20 months ago)
Author:
nomeata
Comment:

--

  • NestedCPR

    v40 v41  
    103 103 === Side tracks ===
    104 104
    105  * Can we use `Converges` CPR information to eagerly evaluate thunks? Yes, and there is a small gain there: #8655.
    106   * Do it in Core to STG!
    107   * But why no allocation change? Understand this better!
    108   * Can we statically and/or dynamically count the number of thunks, and the number of CBV’ed thunks?
    109     * Statically: Add debug statements
    110     * Dynamically: Look for decrease in thunk allocations in ticky.
    111  * Can we use `Converges` in `exprOkForSpeculation`?
     105 * Use `Converges` in `exprOkForSpeculation`: Mostly done, see [ticket:8655#comment:8].
     106   * I should get dynamic numbers, but given the static ones I doubt that these are worth collecting.
    112 107 * Why is `cacheprof` not deterministic? (→ #8611)
    113 108 * What became of Simon’s better-ho-cardinality branch? See [./better-ho-cardinality].
    114 109 * Try VTune to get better numbers.
    115 
    116 ==== Use Converges in exprOkForSpeculation ====
    117 
    118 The flag `Converges` has just about the same meaning as `exprOkForSpeculation`, so we can improve the latter by using the former.
    119 
    120 Effect on nofib is minuscule: `-0.1%` allocations for `fluid`, no other change in allocations.
    121 
    122 In `fluid` there are thunks calling `read_n_val`, whose definition has the form `(.. ,...)`. CPR turns that into `(# ..., ... #)`. So currently we allocate a thunk for the call to the worker of `read_n_val`, which, when forced, allocates a `(..,..)` that is later taken apart by two more thunks, `fst ..` and `snd ..`. After using the `Converges` flag, we call `$wread_n_val` immediately; it returns quickly after allocating two thunks. This also saves the thunks for `fst ..` and `snd ..`.
    123 
    124 But: this already happens in `Float out`, not in CorePrep. It seems that the call from `lvlCase` makes the difference, as the levels differ and the evaluation is later moved out of a `(#..,..#)` construct.
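
The `read_n_val` situation described above can be sketched in source Haskell. This is only an illustration under assumptions: `readNVal`, `wReadNVal`, `lazyUse` and `eagerUse` are hypothetical names, the worker is written by hand in the shape CPR would produce, and the allocation behaviour noted in the comments is the conceptual one (the optimiser may change it).

```haskell
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Hypothetical stand-in for the worker $wread_n_val: it returns an
-- unboxed pair whose components are still lazy (two thunks).
wReadNVal :: Int -> (# Int, Int #)
wReadNVal n = (# n + 1, n * 2 #)
{-# NOINLINE wReadNVal #-}

-- Hypothetical stand-in for the wrapper read_n_val: it reboxes the result.
readNVal :: Int -> (Int, Int)
readNVal n = case wReadNVal n of (# a, b #) -> (a, b)

-- Lazy use: conceptually one thunk for the call, a boxed pair once it is
-- forced, and two more thunks for the selections `fst p` and `snd p`.
lazyUse :: Int -> (Int, Int)
lazyUse n = let p = readNVal n in (fst p, snd p)

-- Eager use: since the call is known to converge cheaply, run the worker
-- right away and bind its two components directly.
eagerUse :: Int -> (Int, Int)
eagerUse n = case wReadNVal n of (# a, b #) -> (a, b)

main :: IO ()
main = do
  print (lazyUse 3)   -- (4,6)
  print (eagerUse 3)  -- (4,6)
```

Both variants compute the same result; the difference is only in when the worker runs and how many intermediate closures are needed.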
    125 
    126 TODO:
    127  * Get static numbers: how many things are ok for speculation before, and how many afterwards?