Changes between Version 6 and Version 7 of DataParallel/WorkPlan


Timestamp:
Jan 27, 2009 1:37:18 PM
Author:
chak
Comment:

--

  • DataParallel/WorkPlan

    v6 v7  
    1313 
    1414 ''Gabi'':: 
     15   '''Hierarchical matrix representation''' 
     16   – status: just started 
    1517 
    1618 ''Manuel'':: 
     
    2022=== Open tasks === 
    2123 
    22 Category: ''efficiency'' (improve scalability and/or baseline performance of generated code): 
     24Category: ''Efficiency'' (improve scalability and/or baseline performance of generated code): 
    2325 
    2426 * '''Replicate:''' Implement an extended array representation that uses an optimised representation for arrays that are the result of common forms of replication (i.e., due to free variables in lifted expressions).  The optimised representation stores the data to be replicated and the replication count(s) instead of actually replicating the data.  This also requires all functions consuming arrays to be adapted. 
     
    4446 * '''Unboxed values:''' Extend vectorisation to handle unboxed values. 
    4547 
    46  * '''Prelude:''' Extend vectorisation to the point, where it can compile the relevant pieces of the standard Prelude, so that we can remove the DPH-specific mini-Prelude.  (Requires: '''Unboxed values) 
     48 * '''Prelude:''' Extend vectorisation to the point, where it can compile the relevant pieces of the standard Prelude, so that we can remove the DPH-specific mini-Prelude.  (Requires: '''Unboxed values''') 
    4749 
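The '''Replicate''' task above can be illustrated with a minimal sketch. This is not the DPH implementation: the type `RArray` and the functions `lengthR`, `indexR`, `mapR`, `sumR` and `expand` are hypothetical names chosen for illustration, and plain lists stand in for the real unboxed arrays. It only shows the idea of storing a value and its replication count, and why every consumer needs a case for the new representation.

```haskell
-- Hypothetical array type (an assumption, not the DPH type):
-- either explicit elements or a value plus a replication count.
data RArray a
  = Flat [a]          -- ordinary array, elements stored explicitly
  | Replicated Int a  -- replication count and the replicated value
  deriving (Show)

-- Every consumer has to handle both constructors.
lengthR :: RArray a -> Int
lengthR (Flat xs)        = length xs
lengthR (Replicated n _) = max 0 n

indexR :: RArray a -> Int -> Maybe a
indexR (Flat xs) i
  | i >= 0, (x:_) <- drop i xs = Just x
  | otherwise                  = Nothing
indexR (Replicated n x) i
  | i >= 0 && i < n = Just x
  | otherwise       = Nothing

-- Mapping over a replicated array applies the function once, not n times.
mapR :: (a -> b) -> RArray a -> RArray b
mapR f (Flat xs)        = Flat (map f xs)
mapR f (Replicated n x) = Replicated n (f x)

-- Summing a replicated array needs one multiplication instead of n additions.
sumR :: Num a => RArray a -> a
sumR (Flat xs)        = sum xs
sumR (Replicated n x) = fromIntegral (max 0 n) * x

-- Fall back to explicit replication when a consumer really needs the elements.
expand :: RArray a -> [a]
expand (Flat xs)        = xs
expand (Replicated n x) = replicate n x

main :: IO ()
main = do
  let r = Replicated 1000000 (2 :: Int)
  print (lengthR r)            -- 1000000
  print (sumR (mapR (* 3) r))  -- 6000000
  print (indexR r 5)           -- Just 2
  print (take 3 (expand r))    -- [2,2,2]
```

In this sketch the replicated array occupies constant space regardless of the count; the cost is the extra case in each consuming function, which is the adaptation effort the task description mentions.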
    4850Category: ''Case studies'' (benchmarks and example applications): 
    4951 
    50  * '''Matrix representation:'''  
     52 * '''Hierarchical matrix representation:''' Sparse matrices can be space-efficiently represented by recursively decomposing them into four quadrants.  Decomposition stops if a quadrant is smaller than a threshold or contains only zeros.  Multiplication of such matrices is straightforward using Strassen's divide-and-conquer scheme, which is popular for parallel implementations.  Other operations, such as transposing a matrix, can also be efficiently implemented.  The plan is to experiment with the implementation of some BLAS routines using this representation. 
    5153 
    52  * '''N-body:''' Get a fully vectorised n-body code to run. 
     54 * '''N-body:''' Get a fully vectorised n-body code to run and scale well on !LimitingFactor.
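
The quadrant decomposition described under '''Hierarchical matrix representation''' might look roughly as follows. This is a sketch under stated assumptions, not the planned implementation: the type `QMatrix` and the functions `fromLists`, `toLists`, `transposeQ` and `storedCells` are hypothetical, matrices are assumed to be square with a power-of-two side, nested lists stand in for real arrays, and decomposition stops once the side is at most the threshold.

```haskell
import Data.List (transpose)

-- Hypothetical quadtree matrix (an assumption for illustration).
-- Each constructor carries the side length of the block it describes.
data QMatrix
  = Zero Int                                  -- block containing only zeros
  | Dense Int [[Double]]                      -- small block stored explicitly
  | Quad Int QMatrix QMatrix QMatrix QMatrix  -- quadrants: NW NE SW SE
  deriving (Show)

-- Build the hierarchical form from a square, power-of-two sized matrix.
-- Decomposition stops at all-zero blocks and at blocks of side <= threshold.
fromLists :: Int -> [[Double]] -> QMatrix
fromLists threshold rows
  | all (all (== 0)) rows  = Zero n
  | n <= max 1 threshold   = Dense n rows
  | otherwise              = Quad n (build (left top))    (build (right top))
                                    (build (left bottom)) (build (right bottom))
  where
    n             = length rows
    h             = n `div` 2
    (top, bottom) = splitAt h rows
    left          = map (take h)
    right         = map (drop h)
    build         = fromLists threshold

-- Convert back to a plain matrix.
toLists :: QMatrix -> [[Double]]
toLists (Zero n)         = replicate n (replicate n 0)
toLists (Dense _ rows)   = rows
toLists (Quad _ a b c d) =
  zipWith (++) (toLists a) (toLists b) ++ zipWith (++) (toLists c) (toLists d)

-- Transposition recurses into the quadrants and swaps NE with SW;
-- all-zero blocks are transposed without touching any data.
transposeQ :: QMatrix -> QMatrix
transposeQ (Zero n)         = Zero n
transposeQ (Dense n rows)   = Dense n (transpose rows)
transposeQ (Quad n a b c d) =
  Quad n (transposeQ a) (transposeQ c) (transposeQ b) (transposeQ d)

-- Number of matrix entries that are stored explicitly.
storedCells :: QMatrix -> Int
storedCells (Zero _)         = 0
storedCells (Dense n _)      = n * n
storedCells (Quad _ a b c d) = sum (map storedCells [a, b, c, d])

example :: [[Double]]
example =
  [ [1, 2, 0, 0]
  , [3, 4, 0, 0]
  , [0, 0, 0, 0]
  , [0, 0, 0, 0] ]

main :: IO ()
main = do
  let q = fromLists 2 example
  print (storedCells q)                               -- 4
  print (toLists (transposeQ q) == transpose example) -- True
```

For the 4x4 example only the non-zero north-west quadrant is stored, so 4 of the 16 entries are kept. Multiplication is left out of the sketch; a recursive scheme over the four quadrants would follow the same case structure.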