Opened 4 years ago

Closed 4 years ago

#5034 closed bug (fixed)

Performance of Data.Graph.{preorderF, postorderF}

Reported by: michalt Owned by: simonmar
Priority: normal Milestone: 7.2.1
Component: libraries (other) Version: 7.0.2
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

The functions can be quite slow in some cases, but it's easy to improve the performance. I've just used Data.Sequence instead of ordinary lists. This makes especially large difference for scc (strongly connected components). In one extreme case I found:

benchmarking extreme case, original: scc
collecting 100 samples, 1 iterations each, in estimated 1.521492 s
bootstrapping with 100000 resamples
mean: 15.38460 ms, lb 15.35417 ms, ub 15.41660 ms, ci 0.950
std dev: 160.7804 us, lb 144.4490 us, ub 184.5445 us, ci 0.950
variance introduced by outliers: 0.990%
variance is unaffected by outliers

benchmarking extreme case, new: scc
collecting 100 samples, 12 iterations each, in estimated 559.8664 ms
bootstrapping with 100000 resamples
mean: 480.0413 us, lb 477.9581 us, ub 483.2204 us, ci 0.950
std dev: 13.01511 us, lb 9.219410 us, ub 22.67802 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
  3 (3.0%) low mild
  1 (1.0%) high severe
variance introduced by outliers: 0.998%
variance is unaffected by outliers

The patch is attached. I've also written some example code to illustrate the performance differences and two small quickcheck tests.

What do you think about it?

Attachments (4)

improve-performance-of-data_graph_.dpatch (22.5 KB) - added by michalt 4 years ago.
tests.zip (6.0 KB) - added by michalt 4 years ago.
Slightly modified (contains both the old and new implementation) Data.Graph module along with criterion and quickcheck tests.
improve-performance-of-digraph_preorderf-_ticket-_5034__.dpatch (99.5 KB) - added by michalt 4 years ago.
Changes the preorder and preorderF in Digraph module.
improve-performance-of-data_graph-_ticket-_5034__.dpatch (22.6 KB) - added by michalt 4 years ago.
Revised patch for Data.Graph.

Download all attachments as: .zip

Change History (12)

Changed 4 years ago by michalt

Changed 4 years ago by michalt

Slightly modified (contains both the old and new implementation) Data.Graph module along with criterion and quickcheck tests.

comment:1 Changed 4 years ago by michalt

  • Status changed from new to patch

comment:2 follow-up: Changed 4 years ago by simonmar

In GHC's version of this code we have

postorder :: Tree a -> [a] -> [a]
postorder (Node a ts) = postorderF ts . (a :)

postorderF   :: Forest a -> [a] -> [a]
postorderF ts = foldr (.) id $ map postorder ts

which I presume fixes the problem modulo a constant factor. Care to test with this version to see what the constant factor is?

comment:3 in reply to: ↑ 2 Changed 4 years ago by michalt

Replying to simonmar:

[...] Care to test with this version to see what the constant factor is?

Sure! I didn't know that GHC has a different version of these functions.

comment:4 Changed 4 years ago by michalt

  • Owner set to michalt

So I did some more testing, this time using also the GHC's version of postorder. And it turns out it is slightly (but consistently) faster than the one I wrote (based on Data.Sequence). So I decided to try the same trick with preorder:

preorder' :: Tree a -> [a] -> [a]
preorder' (Node a ts) = (a :) . preorderF' ts

preorderF' :: Forest a -> [a] -> [a]
preorderF' ts = foldr (.) id $ map preorder' ts

preorderF2 :: Forest a -> [a]
preorderF2 ts = preorderF' ts []

and again it's just a bit faster. Anyway, below are some results (functions with suffix Seq are the ones based on Data.Sequence, the ones without any suffix are from Data.Graph):

benchmarking random graph: preorderF . dff
[..]
mean: 14.45173 ms, lb 14.39151 ms, ub 14.52760 ms, ci 0.950
std dev: 344.7061 us, lb 283.2271 us, ub 464.7833 us, ci 0.950
[..]

benchmarking random graph: preorderF2 . dff
[..]
mean: 2.224018 ms, lb 2.202782 ms, ub 2.247461 ms, ci 0.950
std dev: 114.6005 us, lb 98.60243 us, ub 136.9182 us, ci 0.950
[..]

benchmarking random graph: preorderSeq . dff
[..]
mean: 2.266279 ms, lb 2.245766 ms, ub 2.286897 ms, ci 0.950
std dev: 105.3449 us, lb 94.27900 us, ub 118.3094 us, ci 0.950
[..]

benchmarking random graph: postOrd
[..]
mean: 33.65838 ms, lb 33.56979 ms, ub 33.76757 ms, ci 0.950
std dev: 502.9029 us, lb 408.1316 us, ub 638.5748 us, ci 0.950
[..]

benchmarking random graph: postOrdGHC
[..]
mean: 2.254825 ms, lb 2.224924 ms, ub 2.288206 ms, ci 0.950
std dev: 162.0765 us, lb 144.0785 us, ub 195.5252 us, ci 0.950
[..]

benchmarking random graph: postOrdSeq
[..]
mean: 2.301785 ms, lb 2.283586 ms, ub 2.320407 ms, ci 0.950
std dev: 93.74991 us, lb 84.26520 us, ub 106.4114 us, ci 0.950
[..]

benchmarking random graph: scc
[..]
mean: 37.50496 ms, lb 37.40495 ms, ub 37.61987 ms, ci 0.950
std dev: 548.3968 us, lb 474.6578 us, ub 647.1524 us, ci 0.950
[..]

benchmarking random graph: sccGHC
[..]
mean: 5.643530 ms, lb 5.620937 ms, ub 5.678739 ms, ci 0.950
std dev: 140.9852 us, lb 94.59864 us, ub 223.5663 us, ci 0.950
[..]

benchmarking random graph: sccSeq
[..]
mean: 5.678401 ms, lb 5.621075 ms, ub 5.733683 ms, ci 0.950
std dev: 289.0819 us, lb 266.6904 us, ub 313.9232 us, ci 0.950
[..]

benchmarking extreme case: preorderF . dff
[..]
mean: 6.390497 ms, lb 6.360810 ms, ub 6.433112 ms, ci 0.950
std dev: 180.4465 us, lb 135.0194 us, ub 254.0712 us, ci 0.950
[..]

benchmarking extreme case: preorderF2 . dff
[..]
mean: 191.6084 us, lb 191.2365 us, ub 192.0628 us, ci 0.950
std dev: 2.101061 us, lb 1.780396 us, ub 2.735204 us, ci 0.950
[..]

benchmarking extreme case: preorderSeq . dff
[..]
mean: 210.9569 us, lb 210.3300 us, ub 211.7104 us, ci 0.950
std dev: 3.505228 us, lb 3.001931 us, ub 4.055361 us, ci 0.950
[..]

benchmarking extreme case: postOrd
[..]
mean: 15.23066 ms, lb 15.19714 ms, ub 15.27154 ms, ci 0.950
std dev: 189.1826 us, lb 156.2228 us, ub 237.1671 us, ci 0.950
[..]

benchmarking extreme case: postOrdGHC
[..]
mean: 199.9126 us, lb 199.2887 us, ub 200.6160 us, ci 0.950
std dev: 3.404376 us, lb 3.008538 us, ub 4.050542 us, ci 0.950
[..]

benchmarking extreme case: postOrdSeq
[..]
mean: 224.3217 us, lb 223.6758 us, ub 225.1439 us, ci 0.950
std dev: 3.715079 us, lb 3.023668 us, ub 4.834299 us, ci 0.950
[..]

benchmarking extreme case: scc
[..]
mean: 15.54214 ms, lb 15.50617 ms, ub 15.58197 ms, ci 0.950
std dev: 194.2154 us, lb 170.9911 us, ub 221.9285 us, ci 0.950
[..]

benchmarking extreme case: sccGHC
[..]
mean: 464.2387 us, lb 462.3211 us, ub 466.3139 us, ci 0.950
std dev: 10.19258 us, lb 8.882281 us, ub 11.85275 us, ci 0.950
[..]

benchmarking extreme case: sccSeq
[..]
mean: 490.0950 us, lb 488.4120 us, ub 491.8692 us, ci 0.950
std dev: 8.864578 us, lb 7.751136 us, ub 10.22137 us, ci 0.950
[..]

Changed 4 years ago by michalt

Changes the preorder and preorderF in Digraph module.

Changed 4 years ago by michalt

Revised patch for Data.Graph.

comment:5 Changed 4 years ago by igloo

  • Milestone set to 7.2.1

comment:6 Changed 4 years ago by simonmar

  • Owner changed from michalt to simonmar

comment:7 Changed 4 years ago by simonmar

Applied, thanks!

Mon Mar 21 12:08:57 PDT 2011  Michal Terepeta <[email protected]>
  * Improve performance of Data.Graph (ticket #5034).

comment:8 Changed 4 years ago by simonmar

  • Resolution set to fixed
  • Status changed from patch to closed
Note: See TracTickets for help on using tickets.