Changes between Version 17 and Version 18 of DataParallel/WorkPlan


Ignore:
Timestamp:
Mar 27, 2009 10:51:47 AM (5 years ago)
Author:
chak
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • DataParallel/WorkPlan

    v17 v18  
    77 
    88 * '''Code blow up:''' what do we do about dictionary instances, their inlining, floating, and sharing? 
    9  
    10  * '''More expressive rewrite rules:''' Roman has some concrete ideas, which we need to discuss. 
    119 
    1210 * What the status of using TH for generating library boilerplate? 
     
    2018 
    2119 ''Roman'':: 
    22    '''Template Haskell''', '''Replicate''', #2984 & '''Recycling''' 
     20   '''Template Haskell''' & '''Recycling for joinD''' 
    2321   – status: partly implemented, but still needs serious work 
    24    * To use the special representation of task '''Replicate''' most effectively, we would ''again'' need different views on arrays together with a cost function and optimisation rules taking the cost function into account.  That requires a lot of work! 
    25    * We decided that, for the moment, Roman will first try to integrate the replication representation directly and see how far that gets us.  Maybe it helps at least with some examples and gives us something somewhat usable more quickly. 
    2622   * However, before any further major changes to the library, Roman needs to first re-arrange things such that the library boilerplate is generated, instead of being hardcode; otherwise, changes require a lot of tiresome code editing.  This is partially done. 
    2723 
    2824 ''Simon'':: 
    29    '''Code blow up''' 
     25   ''New dictionary representation and optimisation''' 
    3026   – status: unknown 
    3127 
    3228 ''Gabi'':: 
    33    '''Hierarchical matrix representation''' 
    34    – status: partially implemented (needed new library support for shape manipulations that had to be implemented first) 
     29   '''Regular multi-dimensional arrays''' & '''Hierarchical matrix representation''' 
     30   – status: partially implemented benchmark; regular multi-dimensional arrays are in the design phase 
    3531 
    3632 ''Manuel'':: 
     
    4642Category: ''Efficiency'' (improve scalability and/or baseline performance of generated code): 
    4743 
    48  * '''Replicate:''' Implement an extended array representation that uses an optimised representation for arrays that are the result of common forms of replication (i.e., due to free variables in lifted expressions).  The optimised representation stores the data to be replicated and the replication count(s) instead of actually replicating the data.  This also requires all functions consuming arrays to be adapted.  We may habe a new take on this; see '''More expressive rewrite rules'''. 
    49  
    50  * '''More expressive rewrite rules:''' It seems that with more expressive rewrite rules may enable us to handle many of the problems with replicate an friends using rules.  (At least when the proper inlining happens.)  Much of this is needed to optimise shape computations, which in turn enables subsequent optimisations. 
    51  
    52  * '''Recycling:''' Use Roman's recycling optimisation (PADL'09) to avoid copying in `joinD`. 
     44 * '''Recycling for joinD:''' Use Roman's recycling optimisation (PADL'09) to avoid copying in `joinD`. 
    5345 
    5446 * '''Test new inliner:''' Retest package dph with new inliner and the simplifier changes and try to simplify the library on the basis of these new phases. 
     
    5648 * '''Desugaring comprehensions:''' The current desugaring of array comprehensions produces very inefficient code.  This needs to be improved.  In particular, the `map/crossMap` base case in `dotP` for `x<-xs` and use `zipWith` instead of `map/zip`. 
    5749 
     50 * '''Regular multi-dimensional arrays:''' Representing regular arrays by nested arrays is generally not efficient, but due to the current lack of fusion for segmented arrays that produce multiple arrays of different length (e.g., `filterSU :: (a -> Bool) -> SUArr a -> SUArr a`), matters are worse than one might think (e.g., when using a hierarchical matrix representation).  Hence, we need direct support for regular, multi-dimensional arrays. 
     51 
    5852 * '''Unlifted functions:''' For some scalar functions (especially numeric functions and functions operating on enumerations), it is easier to not lift them at all; rather than to lift them and then attempt to recover the original form with fusion and other optimisations.  An example is the `SumSq` benchmark, where we have `sumP (mapP (\x -> x * x) [:1..n:]`.  Here, we would rather not lift `\x -> x * x` at all.  Roman implemented a very simple form of this idea (which works for `SumSq`).  However, we would like this in a more general form, where named functions that remain unlifted are marked suitably, as clients of a function can only be unlifted if all functions it calls are already unlifted.  How much does that tie in with '''Selective vectorisation'''? 
     53 
     54 * '''Replicate:''' Implement an extended array representation that uses an optimised representation for arrays that are the result of common forms of replication (i.e., due to free variables in lifted expressions).  The optimised representation stores the data to be replicated and the replication count(s) instead of actually replicating the data.  This also requires all functions consuming arrays to be adapted.  ''Update:'' Since we have `INLINE CONLIKE`, we can handle `replicate` using rules as long as it gets close to its consumer during simplification.  This makes the use of an extended array representation less important.  We will delay it until we run into examples where its lack bites us significantly. 
    5955 
    6056Category:  ''Compile time'' (improve compile times): 
     
    8884=== Done === 
    8985 
     86 * '''More expressive rewrite rules:''' It seems that with more expressive rewrite rules may enable us to handle many of the problems with replicate an friends using rules.  (At least when the proper inlining happens.)  Much of this is needed to optimise shape computations, which in turn enables subsequent optimisations. [Further investigation suggested the following approach: We want to handle some functions (until their are inlined) like data constructors during rule matching, so that they also match rule arguments when they are bound to a variable.  This is, for example, important to optimise `repeat` in `smvm`.  (GHC now supports `INLINE CONLIKE` pragmas.)] 
     87 
    9088 * '''Scaling:''' Investigate the scaling problems that we are seeing with vectorised code at the moment.  ('''Replicate''' and '''Recycling''' play a role here, but it is unclear whether that's all.  Some benchmarks are simply memory-bandwidth limited.)  [So far, we only found scaling problems due to memory bandwidth of the tested architecture.  Scaling on the Sun T2 is excellent.] 
    9189