| 17 | [http://darcs.haskell.org/packages/dph/examples/quickhull/ Quickhull]:: |
| 18 | Given a set of points (in a plane), compute the sequence of points that encloses all points in the set. There is only a vectorised version. '''Currently doesn't work due to bugs in dph-par. Also needs to get a wrapper using the new benchmark framework to generated test input and time execution.''' |
| 19 | [http://darcs.haskell.org/packages/dph/examples/qsort/ Quicksort]:: |
| 20 | FIXME |
| 21 | [http://darcs.haskell.org/packages/dph/examples/concomp/ ConComp]:: |
| 22 | Implementation of the Awerbuch-Shiloach and Hybrid algorithms for finding connected components in undirected graphs. There is only a version directly coded against the array primitives. '''Needs to be adapted to new benchmark framework.''' |
| 23 | [http://darcs.haskell.org/packages/dph/examples/barnesHut/ BarnesHut]:: |
| 24 | This benchmark implements the Barnes-Hut algorithm to solve the ''n''-body problem in two dimensions. '''Currently won't compile with vectorisation due to excessive inlining of dictionaries.''' |
| 25 | |
88 | | ---- |
89 | | |
90 | | || '''Program''' || '''Sequential (manually vectorised) ''' || '''Vectorised''' || ''' Parallel''' || |
91 | | || DotP ||Order of mag. faster than list impl || Same performance as seq. || speedup of 2 for 2 CPUs, 4 threads || |
92 | | || !QuickSort ||Slower than list (fusion) || Slower than seq. (why?) || speedup of 1.4 on 2 CPUs || |
93 | | || !SparseVector ||Similar to DotP || || || |
94 | | || Primes (Nesl) ||15 x faster than list version || NYI || 20 x slower than seq (fusion?) || |
95 | | || Primes (Simon) ||NYI || Working || NYI || |
96 | | || !BarnesHut ||Small bug in alg || Working || See seq. || |
97 | | |
98 | | |
99 | | General remarks: |
100 | | |
101 | | * I only ran a first set of benchmarks when checking for what's there. I'll run the benchmarks properly as next step |
102 | | |
103 | | * Fusion doesn't work well on parallel programs yet, so for all but simple examples, the parallel program performs worse than the sequential |
104 | | |
105 | | * The compiler doesn't exploit all fusion opportunities for QSort and !BarnesHut. Once this is fixed, they should run considerably faster. |
106 | | |
107 | | * Interestingly, the automatically vectorised version of qsort is quite a bit faster than the hand-flattened. Need to find out why. |
108 | | |