Changes between Version 26 and Version 27 of DataParallel/BenchmarkStatus


Timestamp:
Mar 6, 2009 2:03:28 PM
Author:
chak
Comment:

--

  • DataParallel/BenchmarkStatus

    v26 v27  
    2424|| '''Program''' || '''Problem size''' || '''sequential''' || '''P=1''' || '''P=2''' || '''P=4''' || '''P=8''' || 
    2525|| !SumSq, primitives || 10M || 22 || 40 || 20 || 10 || 5 || 
    26 || !SumSq, vectorised || 10M || 22 || 292 || 170 || 119 || 171 || 
     26|| !SumSq, vectorised || 10M || 22 || 40 || 20 || 10 || 5 || 
    2727|| !SumSq, ref C ||10M || 9 || – || – || – || – || 
    2828|| DotP, primitives || 100M elements || 823/823/824 || 812/813/815 || 408/408/409 || 220/223/227 || 210/214/221 || 
     
    3838==== Comments regarding !SumSq ==== 
    3939 
    40 The "primitives" version works nicely, but the vectorised one exposes some problems: 
     40The versions compiled against `dph-par` are slower by a factor of two than the ones linked against `dph-seq`.  This is because the parallel versions need to compute the length of the array to determine how to split the work. 
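To illustrate why splitting the work depends on the array length, here is a minimal sketch on plain Haskell lists. It is only a model under stated assumptions, not the `dph-par` implementation: `splitInto` and `sumSqChunked` are hypothetical names introduced for this example, and the chunks are summed sequentially rather than on separate threads.

```haskell
-- Hypothetical model on plain lists; not the dph-par implementation.

-- Split xs into p nearly equal chunks (p is assumed to be positive).
-- The length n must be known before the first chunk can be handed out,
-- so the whole input has to be traversed once up front.
splitInto :: Int -> [a] -> [[a]]
splitInto p xs = go sizes xs
  where
    n      = length xs
    (q, r) = n `divMod` p
    -- the first r chunks get one extra element
    sizes  = replicate r (q + 1) ++ replicate (p - r) q
    go []       _  = []
    go (s : ss) ys = let (c, rest) = splitAt s ys in c : go ss rest

-- Sum of squares computed chunk by chunk, then combined.
sumSqChunked :: Int -> [Int] -> Int
sumSqChunked p = sum . map (sum . map (\x -> x * x)) . splitInto p
```

For example, `splitInto 4 [1..10]` yields chunks of sizes 3, 3, 2 and 2, and `sumSqChunked 4 [1..10]` agrees with the unchunked sum of squares.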
     41 
     42However, we found a number of general problems when working on this example: 
    4143 * We need an extra -funfolding-use-threshold.  We don't really want users having to worry about that. 
    4244 * `mapP (\x -> x * x) xs` essentially turns into `zipWithU (*) xs xs`, which doesn't fuse with `enumFromTo` anymore.  We have a rewrite rule in the library to fix that, but that's not general enough.  We really would rather not vectorise the lambda abstraction at all. 
    4345 * `enumFromTo` doesn't fuse due to excessive dictionaries in the unfolding of `zipWithUP`. 
     46 * Finally, to achieve the current result, we needed an analysis that avoids vectorising subcomputations that don't need to be vectorised and that, worse, fusion would then have to turn back into their original form.  In this case, that is the lambda abstraction `\x -> x * x`.  This is currently implemented in a rather limited and ad-hoc way.  We should implement this on the basis of a more general analysis. 
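The difference between the two forms of the squaring step mentioned above can be sketched with ordinary list functions. This is a hedged illustration only: `map`, `zipWith` and `enumFromTo` from the Prelude stand in for `mapP`, `zipWithU` and the DPH `enumFromTo`, and the function names `sumSqDirect` and `sumSqVectorised` are made up for this example.

```haskell
-- Plain-list stand-ins; these are not the DPH library functions.

-- Unvectorised lambda: the producer is consumed exactly once, which is
-- the shape that a producer/consumer fusion framework handles well.
sumSqDirect :: Int -> Int
sumSqDirect n = sum (map (\x -> x * x) (enumFromTo 1 n))

-- The shape the text says vectorisation essentially produces: the same
-- array xs feeds both arguments of zipWith, so the producer is shared
-- between two consumers instead of being inlined into a single one.
sumSqVectorised :: Int -> Int
sumSqVectorised n =
  let xs = enumFromTo 1 n
  in  sum (zipWith (*) xs xs)
```

Both functions compute the same result; the point of the sketch is only the shape of the intermediate expression, which is what decides whether the producer can fuse with its consumer.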
    4447 
    4548==== Comments regarding DotP ====