Changes between Version 11 and Version 12 of DataParallel/BenchmarkStatus


Timestamp:
Mar 1, 2009 11:27:27 AM
Author:
chak
Comment:

--

  • DataParallel/BenchmarkStatus

    v11 v12  
    88 
    99 [http://darcs.haskell.org/packages/dph/examples/dotp/ DotP]:: 
    10   Computes the dot product of two vectors of `Double`s.  There are two variants of this program: (1) "primitives" is directly coded against the array primitives from package dph and (2) "vectorised" is a high-level DPH program transformed by GHC's vectoriser. 
     10  Computes the dot product of two vectors of `Double`s.  There are two variants of this program: (1) "primitives" is directly coded against the array primitives from package dph and (2) "vectorised" is a high-level DPH program transformed by GHC's vectoriser.  In addition to these two DPH variants of the dot product, we also have two non-DPH reference implementations: (a) "ref Haskell" is a Haskell program using imperative, unboxed arrays and (b) "ref C" is a C implementation using pthreads. 
    1111 [http://darcs.haskell.org/packages/dph/examples/smvm/ SMVM]:: 
    1212  Multiplies a dense vector with a sparse matrix represented in the ''compressed sparse row format (CSR).''  There are two variants of this program: (1) "primitives" is directly coded against the array primitives from package dph and (2) "vectorised" is a high-level DPH program transformed by GHC's vectoriser. 
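For readers unfamiliar with the two kernels described above, the following is a minimal sequential sketch using plain Haskell lists. It is not the DPH code from the linked repositories and does not use the dph array primitives; the names `dotp`, `CSR`, `rowPtr`, `colIdx`, `vals`, and `smvm` are hypothetical and chosen only for illustration. It assumes the usual CSR layout, where `rowPtr` holds the start offset of each row plus one final end offset.

```haskell
-- Illustrative sketch only: plain lists, no parallelism, not the dph API.

-- Dot product of two vectors of Doubles.
dotp :: [Double] -> [Double] -> Double
dotp xs ys = sum (zipWith (*) xs ys)

-- Compressed sparse row (CSR) matrix (hypothetical type for this sketch).
data CSR = CSR
  { rowPtr :: [Int]     -- start offset of each row, plus a final end offset
  , colIdx :: [Int]     -- column index of each stored element
  , vals   :: [Double]  -- value of each stored element
  }

-- Multiply a CSR matrix with a dense vector, one result element per row.
smvm :: CSR -> [Double] -> [Double]
smvm (CSR ptr cols vs) x =
  [ sum [ v * (x !! c) | (c, v) <- take (hi - lo) (drop lo entries) ]
  | (lo, hi) <- zip ptr (drop 1 ptr)
  ]
  where
    entries = zip cols vs
```

List indexing with `!!` keeps the sketch short but is linear in the index, so this shows only the structure of the computation, not how the benchmarked programs achieve their performance.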
     
    2222|| DotP, vectorised || 100M elements || 823/824/824 || 814/816/818 || 412/417/421 || 222/225/227 || 227/232/238 || 
    2323|| DotP, ref Haskell || 100M elements || – || 810 || 601 || 221 || 209 || 
     24|| DotP, ref C || 100M elements || – || 458 || 235 || 210 || 210 || 
    2425|| SMVM, primitives || ?? elems, density ?? ||  ||  ||  ||  ||  || 
    2526|| SMVM, vectorised || ?? elems, density ?? ||  ||  ||  ||  ||  || 
     
    4647 * Fusion doesn't work well on parallel programs yet, so for all but simple examples, the parallel program performs worse than the sequential one. 
    4748 
    48  * The compiler doesn't exploit all fusion opportunities for QSort and BarnesHut. Once this is fixed, they should run considerably faster. 
     49 * The compiler doesn't exploit all fusion opportunities for QSort and !BarnesHut. Once this is fixed, they should run considerably faster. 
    4950 
    5051 * Interestingly, the automatically vectorised version of qsort is quite a bit faster than the hand-flattened one. Need to find out why.