Changes between Version 28 and Version 29 of Supercompilation


Ignore:
Timestamp:
Jul 23, 2009 8:02:06 AM (5 years ago)
Author:
simonpj
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Supercompilation

    v28 v29  
    33This page is very much a draft and may be incorrect in places. Please fix problems that you spot. 
    44 
    5 == Diffferences from Supero == 
     5== Current status == 
    66 
    7 An attempted list at differences between [http://community.haskell.org/~ndm/supero Supero] and [http://www.csee.ltu.se/~pj/papers/scp/popl09-scp.pdf positive supercompilation for call-by-value] (abbreviated to pscp below).  
     7What next? '''Implement the new algorithm.''' 
    88 
    9 1) Call-by-value vs Call-by-n{ame,eed}. I am not completely clear on whether Supero preserves sharing. (Pscp does, otherwise the improvement theory is not usable and the proof doesn't go through. Previous discussions between Neil and Peter concluded that this is not sufficient - more expressions need to be inlined so we do not want to preserve all the sharing for performance reasons; Simon later pointed out that we must preserve sharing). 
     9 * Write msg, split in the R form.  Still with eager substitution 
     10 * Figure out arity for each top-level (lambda lifted) function, and only inline when it is saturated.  (Write notes in paper, explaining why this might be good.)  NB: linearity becomes simpler, because a variable cannot occur under a lambda. 
     11 * Refined whistle-blowing test 
     12 * Neil's msg idea 
    1013 
    11 2) Termination criterion. Both use the homeomorphic embedding, but there are some small differences. 
    12   a) Pscp checks against fewer previous expressions compared to Supero. (Pscp only looks at expressions on the form R<g es>, where R are evaluation contexts for call-by-name, and g are top/local level definitions. Let-statements, for example, are not bound in our contexts). This could be beneficial for compilation speed, but I don't have any concrete numbers right now. The drawback is that our approach can not perform Distillation; a more powerful transformation that makes the checks even more expensive. This should not be hard to change in an actual implementation though. 
     14Later 
     15 * Faster representation for memo table; a finite map driven by the head function 
     16 * Using lazy substitutions 
     17 * Case-of-case duplication 
     18 * Post-pass to identify deepId 
     19 * Post-pass to undo redundant specialisation?? 
     20 * Neil does "evaluation" before specialising, to expose more values to let, and maybe make lets into linear lets.  We don't. Yet. 
    1321 
    14   b) Generalisation. Pscp currently uses the msg, and Neil has deducted a better way to split expressions. Switching between them should be a matter of changing a couple of lines of code in an actual implementation. 
     22Done  
     23 * State monad and good logging info; Stole SimplMonad. 
     24 * Lambda lifting 
     25 * Add the "loop-breaker" info to interface files (and read it back in). 
     26 * Export unfoldings for recursive functions; does not validate:  
     27    * ds060: Overlapping pattern match? 
     28    * ds061: Turns pattern-matches non-exhaustive 
     29    * dsrun015: Foo.x not in scope 
     30    * driver063: Exposes modules that were invisible earlier.  
     31    * print010: changes output from Integer to GHC.Integer.GMP.Internals.Integer 0 to GHC.Integer.GMP.Internals.S# 0. 
     32    * break026: No show instance 
    1533 
    16   c) simpleTerminate. This roughly corresponds to a mixture between values and what pscp have labeled as "annoying expressions". Neil was spot on in earlier discussions on what annoying expressions are: something where evaluation is stuck (normally because of a free variable in the next position, for example the application "x es"). Simon has an interesting algorithm formulation that avoids annoying expressions altogether and is probably simpler to implement. 
    1734 
    18 3) Inlining decisions. Neil has a more advanced way of determining when things should be inlined or not. I believe that the solution is some kind of union between the inliner paper, Neil's work and possibly some kind of cost calculus for inlining. 
    19  
    20 What is not mentioned in either of our works though is that it is a typed intermediate representation in GHC. System Fc has the casts that might get in the way of transformations (so effectively they are some kind of annoying expressions). Rank-n polymorphism is another potential problem, but I am not sure if that will be a problem with System Fc. These are not fundamental problems that will take two months to solve, but I think that implementing a supercompiler for System Fc is more than just two days of intense hacking, even for someone already familiar with GHC internals. 
    21  
    22 == Progress report at the end of June == 
    23  
    24 A substitution based implementation exists, that transforms append, reverse with accumulating parameters, basic arithmetics and similar things. There are still bugs in the implementation, mainly name capture. It takes 8 seconds to transform the double append example, but there's still plenty of room for improvement with respect to performance. 
     35A substitution-based implementation exists, that transforms append, reverse with accumulating parameters, basic arithmetics and similar things. There are still bugs in the implementation, mainly name capture. It takes 8 seconds to transform the double append example, but there's still plenty of room for improvement with respect to performance. 
    2536 
    2637The typed intermediate representation has caused some trouble, but nothing fundamental.  
     
    4354   * Extending this to specialised functions themselves. 
    4455 
    45 What next? '''Implement the new algorithm.''' 
    46  
    47  * Write drive, msg, split in the R form.  Still with eager substitution 
    48  * Refined whistle-blowing test 
    49  * Neil's msg idea 
    50  
    51 Later 
    52  * Using lazy substitutions 
    53  * Case-of-case duplication 
    54  * Post-pass to identify deepId 
    55  * Post-pass to undo redundant specialisation?? 
    56  * Neil does "evaluation" before specialising, to expose more values to let, and maybe make lets into linear lets.  We don't. Yet. 
    57  
    58 Done  
    59  * State monad and good logging info; Stole SimplMonad. 
    60  * Lambda lifting 
    61  * Add the "loop-breaker" info to interface files (and read it back in). 
    62  * Export unfoldings for recursive functions; does not validate:  
    63     * ds060: Overlapping pattern match? 
    64     * ds061: Turns pattern-matches non-exhaustive 
    65     * dsrun015: Foo.x not in scope 
    66     * driver063: Exposes modules that were invisible earlier.  
    67     * print010: changes output from Integer to GHC.Integer.GMP.Internals.Integer 0 to GHC.Integer.GMP.Internals.S# 0. 
    68     * break026: No show instance 
    6956 
    7057 
     
    9784 * Names:  
    9885   * Why is it sometimes $dShow{v a1lm} and sometimes $dShow_a1lm? The latter is easier to grep for. 
     86 
     87== Diffferences from Supero == 
     88 
     89An attempted list at differences between [http://community.haskell.org/~ndm/supero Supero] and [http://www.csee.ltu.se/~pj/papers/scp/popl09-scp.pdf positive supercompilation for call-by-value] (abbreviated to pscp below).  
     90 
     911) Call-by-value vs Call-by-n{ame,eed}. I am not completely clear on whether Supero preserves sharing. (Pscp does, otherwise the improvement theory is not usable and the proof doesn't go through. Previous discussions between Neil and Peter concluded that this is not sufficient - more expressions need to be inlined so we do not want to preserve all the sharing for performance reasons; Simon later pointed out that we must preserve sharing). 
     92 
     932) Termination criterion. Both use the homeomorphic embedding, but there are some small differences. 
     94  a) Pscp checks against fewer previous expressions compared to Supero. (Pscp only looks at expressions on the form R<g es>, where R are evaluation contexts for call-by-name, and g are top/local level definitions. Let-statements, for example, are not bound in our contexts). This could be beneficial for compilation speed, but I don't have any concrete numbers right now. The drawback is that our approach can not perform Distillation; a more powerful transformation that makes the checks even more expensive. This should not be hard to change in an actual implementation though. 
     95 
     96  b) Generalisation. Pscp currently uses the msg, and Neil has deducted a better way to split expressions. Switching between them should be a matter of changing a couple of lines of code in an actual implementation. 
     97 
     98  c) simpleTerminate. This roughly corresponds to a mixture between values and what pscp have labeled as "annoying expressions". Neil was spot on in earlier discussions on what annoying expressions are: something where evaluation is stuck (normally because of a free variable in the next position, for example the application "x es"). Simon has an interesting algorithm formulation that avoids annoying expressions altogether and is probably simpler to implement. 
     99 
     1003) Inlining decisions. Neil has a more advanced way of determining when things should be inlined or not. I believe that the solution is some kind of union between the inliner paper, Neil's work and possibly some kind of cost calculus for inlining. 
     101 
     102What is not mentioned in either of our works though is that it is a typed intermediate representation in GHC. System Fc has the casts that might get in the way of transformations (so effectively they are some kind of annoying expressions). Rank-n polymorphism is another potential problem, but I am not sure if that will be a problem with System Fc. These are not fundamental problems that will take two months to solve, but I think that implementing a supercompiler for System Fc is more than just two days of intense hacking, even for someone already familiar with GHC internals. 
     103