3 ways to write a function (unexpected performance difference and regression)
|Reported by:||claus||Owned by:||igloo|
|Type of failure:||Runtime performance bug||Test Case:||T4474a T4474b T4474c|
|Related Tickets:||Differential Rev(s):|
Consider the attached comparison.hs, a small performance test for difference lists. I was somewhat surprised by the performance differences between flatListCons and flatDList, and I tried to reproduce the issue by writing 3 equivalent versions of flatListCons.
flatListCons t = flat t  where flat (Leaf n) ns = n:ns flat (Fork a b) ns = flat a (flat b ns) flatListCons2 t = flat t  where flat (Leaf n) = \ns -> n:ns flat (Fork a b) = \ns -> flat a (flat b ns) flatListCons3 t = flat t  where flat (Leaf n) = (n:) flat (Fork a b) = flat a . flat b
Not only does GHC not give equal performance for the 3 versions (factor of 2), but GHC-6.12.3 and GHC-7.1.20101022 differ in optimizations. With GHC-6.12.3, flatListCons and flatListCons2 are fast, flatListCons3 is slow. With GHC-7.1.20101022, only flatListCons is fast, flatListCons2 and flatListCons3 are slow:
$ time ./comparison-6.12.3.exe c 26 67108864 real 0m4.574s user 0m0.015s sys 0m0.358s $ time ./comparison-6.12.3.exe c2 26 67108864 real 0m4.442s user 0m0.046s sys 0m0.312s $ time ./comparison-6.12.3.exe c3 26 67108864 real 0m10.694s user 0m0.046s sys 0m0.328s $ time ./comparison-7.1.20101022.exe c 26 67108864 real 0m4.473s user 0m0.046s sys 0m0.327s $ time ./comparison-7.1.20101022.exe c2 26 67108864 real 0m10.791s user 0m0.046s sys 0m0.327s $ time ./comparison-7.1.20101022.exe c3 26 67108864 real 0m10.729s user 0m0.078s sys 0m0.280s
Ideally, I'd like all three versions (as well as flatDList) to be equally fast.