Changes between Version 56 and Version 57 of Supercompilation


Ignore:
Timestamp:
Nov 20, 2009 1:51:37 PM (6 years ago)
Author:
simonpj
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Supercompilation

    v56 v57  
    55== Current Bugs ==
    66
    7 * Naivrec: Double runtime
    8 
    9 * nbody: unsafe-things, segfault.
    10 
    11 * boyer2: head: empty list; splitTerm
    12 
    13 * Sieve2: run with argument 3. Wrong output.
    14 
    15 == Open shortcomings ==
    16 
    17 * Change representation in rho
    18 
    19 * case var subst
    20 
    21 * strictness annotations
    22 
    23 * unsafePerformIO etc.
    24 
    25 * Extend homemb.
    26 
    27 == Lambda lifting ==
    28 
    29 * We lambda lift to avoid re-specialising the same local function in two different branches: letrec g = .. in (f1 (g x), f2 (g x))
    30 
    31 == Problems ==
    32 
    33 * The Simplifier gets confused by the wrong OccInfo on things. So run occurence analysis at the end of supercompilation. The occurence analyser gets confused by having the wrong Unfoldings for id's though. We currently zap things here and there, but this is not the right way to do it.
    34 
    35 == Scalability ==
    36 
    37 * The whistle is THE bad guy when it comes to performance; we spend 35% of our time on testing. It is trivial to run in parallell.
     7 * Naivrec: Double runtime
     8 * nbody: unsafe-things, segfault.
     9 * boyer2: head: empty list; splitTerm
     10 * Sieve2: run with argument 3. Wrong output.
     11 * The Simplifier gets confused by the wrong `OccInfo` on things. So run occurrence analysis at the end of supercompilation. The occurrence analyser gets confused by having the wrong Unfoldings for id's though. We currently zap things here and there, but this is not the right way to do it.
     12
     13== Shortcomings of the prototype ==
     14
     15 * Use a state monad
     16 * Uses eager substitution
     17 * Divide by zero
     18 * Homeomorphic embedding for types?  Currently all types are regarded as equal (like literals).  Decision: leave it this way for now.
     19 * Msg does not respect alpha-equivalence.  If we match lambda against lambdas, and the binders differ, we say "different".  Decision: deal with alpha-equiv in msg when we have the new alg working.
     20 * Inlining `unsafePerformIO`
     21 * Adding constraint info
     22   * case (x>y)of { ....case (x>y) of ... }
     23   * Extending this to specialised functions themselves.
     24 * Change representation in rho
     25 * case var subst
     26 * strictness annotations
     27 * Extend homemb.
    3828
    3929
    4030== Insights ==
    4131
    42 * Looking for instances, instead of renamings, has implications with the global store. Do we want to fold append (append xs' ys) zs against append (append xs ys) zs (in rho) or append ys zs (in store).
    43 
    44 * Applications are not saturated in Core, there's eta::GHC.Prim.State# GHC.Prim.RealWorld roughly everywhere.
    45 
    46 * The whistle blows on several expressions sometimes. We need to sort them. Example:
    47 
     32 * The whistle is THE bad guy when it comes to performance; we spend 35% of our time on testing. It is trivial to run in parallel.
     33
     34 * We lambda lift to avoid re-specialising the same local function in two different branches: letrec g = .. in (f1 (g x), f2 (g x))
     35
     36 * Looking for instances, instead of renamings, has implications with the global store. Do we want to fold append (append xs' ys) zs against append (append xs ys) zs (in rho) or append ys zs (in store).
     37
     38 * Applications are not saturated in Core, there's eta::GHC.Prim.State# GHC.Prim.RealWorld roughly everywhere.
     39
     40 * The whistle blows on several expressions sometimes. We need to sort them. Example:
     41{{{
    4842case GHC.Num.- @ GHC.Types.Int GHC.Num.$fNumInt (GHC.Num.+ @ GHC.Types.Int GHC.Num.$fNumInt i (Main.check l)) (Main.check r) of _ {
    4943  GHC.Types.I# y [ALWAYS Once Nothing] -> GHC.Types.I# (GHC.Prim.-# (GHC.Prim.+# x 0) y)
    5044}
    51 
    52 against both of these:
    53 
     45}}}
     46  against both of these:
     47{{{
    5448case Main.check r of _ {
    5549  GHC.Types.I# y [ALWAYS Once Nothing] -> GHC.Types.I# (GHC.Prim.-# (GHC.Prim.+# x 0) y)
    5650}
    5751
    58 
    5952GHC.Num.- @ GHC.Types.Int GHC.Num.$fNumInt (GHC.Num.+ @ GHC.Types.Int GHC.Num.$fNumInt i (Main.check l)) (Main.check r))
    60 
    61 The latter is better to generalise against. How do we capture this? Perhaps by sorting on the length of free variables in rho.
     53}}}
     54  The latter is better to generalise against. How do we capture this? Perhaps by sorting on the length of free variables in rho.
     55
     56== Open questions ==
     57
     58 * The new msg for different types. What should the outcome of:
     59{{{
     60msg (rev @ Int (xs::[Int]) (rev @ Bool (xs::[Bool]))
     61}}}
     62  be? We currently fail and split to `[rev @ Int/z]` and `z xs` instead.
     63
     64 * Rank-n-polymorphism. Given: `msg (f e) (f e')`, we return `f (z::typeof e)` and `[e/z]`. I'm not sure if it's possible to create an example where we should have taken z to have the same type as the type signature for the argument of f instead.
     65
     66 * You had a suggestion to convert the entire program to the zipper-representation before supercompiling.
     67
     68 * The supercompiler can after transformation leave deep identity functions in place, that will traverse the entire structure and re-assemble it.  If you supercompile `flip (flip t)` the resulting program will be `deepId t` where
     69{{{
     70deepId (Leaf d)     = Leaf d
     71deepId (Branch l r) = Branch (deepId l) (deepId r)
     72}}}
     73  Neil removed these in a post-pass as 
     74far as I understood, and he said it was a good idea.
     75
     76 * Using other constraints than equality. This means things like
     77{{{
     78        case (a>b) of ....(case a+1>b of ...) ...
     79}}}
     80  (This could save more transformation time than it spends if it manages to eliminate a lot of case-alternatives, and intuitively it doesn't feel prohibitively expensive for many constraint domains).
     81
     82 * Should R contexts include let-statements? Need to worry about name capture even more then.
     83
     84 * Should matching for renamings be modulo permutation of lets? (Performance vs code size)
     85
     86 * Consider `(\x xs. append x xs)`.  Do we inline append, and create a specialised copy?  (Of course, identical to the original definition.)
     87   * '''Yes'''.  At provided we don't create ''multiple'' specialised copies, we are effectively copying library code into the supercompiled program.  Then we can discard all libraries (provided we have all unfoldings).
     88   * '''No''': then need to keep the libraries
     89   But it's not clear that we can ''always'' inline ''everything''.  For example things with `unsafePerformIO`.
     90     * (Given no cycle in imports) Perhaps the things we can not inline should be put at the top level in the same module, and the old module discarded?
     91
     92 * Can we improve the homeomorphic embedding so that append xs xs is not embedded in append xs ys?
     93
     94 * http://hackage.haskell.org/trac/ghc/ticket/2598
     95
     96
    6297
    6398
     
    104139The typed intermediate representation has caused some trouble, but nothing fundamental.
    105140
    106 Shortcomings of the prototype:
    107  * Use a state monad
    108  * Uses eager substitution
    109  * Divide by zero
    110  * Homeomorphic embedding for types?  Currently all types are regarded as equal (like literals).  Decision: leave it this way for now.
    111  * Msg does not respect alpha-equivalence.  If we match lambda against lambdas, and the binders differ, we say "different".  Decision: deal with alpha-equiv in msg when we have the new alg working.
    112  * Inlining `unsafePerformIO`
    113  * Adding constraint info
    114    * case (x>y)of { ....case (x>y) of ... }
    115    * Extending this to specialised functions themselves.
     141Random thoughts about the prettyprinter:
     142
     143 * Let-statements:
     144   * LclId is not really useful for me.
     145   * The empty list on the line below it is also wasting space.
     146   * The occurence information sometimes pushes the type signature over several lines.
     147 * Case-statements:
     148   * The occurence information pushes case branches over several lines.
     149   * case e of { tp1 tp2 tp3 tp4 -> tp3 } prettyprints over 5 lines.
     150 * Long types: Is the complete "GHC.Types.Char" necessary?
     151 * Names:
     152   * Why is it sometimes $dShow{v a1lm} and sometimes $dShow_a1lm? The latter is easier to grep for.
    116153
    117154== Performance ==
     
    149186}}}
    150187
    151 == Open questions ==
    152 
    153  * Should R contexts include let-statements? Need to worry about name capture even more then.
    154 
    155  * Should matching for renamings be modulo permutation of lets? (Performance vs code size)
    156 
    157  * Consider `(\x xs. append x xs)`.  Do we inline append, and create a specialised copy?  (Of course, identical to the original definition.)
    158    * '''Yes'''.  At provided we don't create ''multiple'' specialised copies, we are effectively copying library code into the supercompiled program.  Then we can discard all libraries (provided we have all unfoldings).
    159    * '''No''': then need to keep the libraries
    160    But it's not clear that we can ''always'' inline ''everything''.  For example things with `unsafePerformIO`.
    161      * (Given no cycle in imports) Perhaps the things we can not inline should be put at the top level in the same module, and the old module discarded?
    162 
    163  * Can we improve the homeomorphic embedding so that append xs xs is not embedded in append xs ys?
    164 
    165  * http://hackage.haskell.org/trac/ghc/ticket/2598
    166 
    167 Random thoughts about the prettyprinter:
    168 
    169  * Let-statements:
    170    * LclId is not really useful for me.
    171    * The empty list on the line below it is also wasting space.
    172    * The occurence information sometimes pushes the type signature over several lines.
    173  * Case-statements:
    174    * The occurence information pushes case branches over several lines.
    175    * case e of { tp1 tp2 tp3 tp4 -> tp3 } prettyprints over 5 lines.
    176  * Long types: Is the complete "GHC.Types.Char" necessary?
    177  * Names:
    178    * Why is it sometimes $dShow{v a1lm} and sometimes $dShow_a1lm? The latter is easier to grep for.
    179188
    180189== Diffferences from Supero ==