Strictness analysis regression
Reported by: tibbe
Owned by:
Operating System: MacOS X
Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug
Test Case:
Related Tickets:
Differential Rev(s):
Description (last modified by tibbe)
Edit: There were two issues discussed here. One is solved. I left the ticket open for the strictness analysis regression part. Analysis of strictness regression starts in comment 7 below.
I ran a simple benchmark that exercises Data.HashMap.Lazy.insert. It is 16% slower with HEAD than with 7.6.3 (0.43s vs 0.37s total time). The generated Core is a bit different and the generated Cmm is quite a bit different.
Steps to reproduce
- Download the attached HashMapInsert.hs benchmark.
- Install unordered-containers with both 7.6.3 and HEAD:
  $ cabal install -w ghc-7.6.3 unordered-containers-0.2.3.3
  $ cabal install -w inplace/bin/ghc-stage2 unordered-containers-0.2.3.3
- Compile the benchmark with both compilers:
  $ ghc-7.6.3 -O2 HashMapInsert.hs
  $ mv HashMapInsert HashMapInsertOld
  $ inplace/bin/ghc-stage2 -O2 HashMapInsert.hs
  $ mv HashMapInsert HashMapInsertNew
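The results are reported as the best of 3 runs. One way to automate that is sketched below. It assumes "best" means the smallest user-time figure on the `Total time` line of the `+RTS -s` output, which the ticket does not state; the helper names `total_time` and `best_of` are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ticket.

# Read +RTS -s output on stdin and print the user-time figure from the
# "Total time" line, without the trailing "s" (e.g. 0.37).
total_time() {
    awk '$1 == "Total" && $2 == "time" { sub(/s$/, "", $3); print $3 }'
}

# usage: best_of N COMMAND [ARGS...]
# Runs COMMAND N times with +RTS -s and prints the smallest total time.
# The RTS writes the -s summary to stderr, so stderr is sent down the pipe
# and the program's own stdout is discarded.
best_of() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" +RTS -s 2>&1 >/dev/null | total_time
        i=$((i + 1))
    done | sort -n | head -n 1
}
```

With the two binaries built as above, `best_of 3 ./HashMapInsertOld` and `best_of 3 ./HashMapInsertNew` would each print a single number of seconds.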
Results (best of 3 runs)
$ ./HashMapInsertOld +RTS -s
   1,191,223,528 bytes allocated in the heap
     141,978,520 bytes copied during GC
      37,811,840 bytes maximum residency (8 sample(s))
      22,378,432 bytes maximum slop
              99 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2277 colls,     0 par    0.06s    0.06s     0.0000s    0.0002s
  Gen  1         8 colls,     0 par    0.07s    0.10s     0.0127s    0.0479s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.24s  (  0.24s elapsed)
  GC      time    0.13s  (  0.17s elapsed)
  EXIT    time    0.00s  (  0.01s elapsed)
  Total   time    0.37s  (  0.41s elapsed)

  %GC     time      34.8%  (40.3% elapsed)

  Alloc rate    4,923,204,681 bytes per MUT second

  Productivity  65.2% of total user, 59.0% of total elapsed
$ ./HashMapInsertNew +RTS -s
   1,191,223,128 bytes allocated in the heap
     231,158,688 bytes copied during GC
      55,533,064 bytes maximum residency (13 sample(s))
      22,378,488 bytes maximum slop
             144 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2268 colls,     0 par    0.06s    0.07s     0.0000s    0.0003s
  Gen  1        13 colls,     0 par    0.12s    0.16s     0.0127s    0.0468s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.25s  (  0.25s elapsed)
  GC      time    0.18s  (  0.23s elapsed)
  EXIT    time    0.00s  (  0.01s elapsed)
  Total   time    0.43s  (  0.49s elapsed)

  %GC     time      41.6%  (47.5% elapsed)

  Alloc rate    4,738,791,249 bytes per MUT second

  Productivity  58.3% of total user, 51.9% of total elapsed
(Note that this is without the patches in #8885, so they're not the cause.)
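An interesting difference is that we spend more time in GC in HEAD (0.18s vs 0.13s), even though the bytes allocated are essentially identical. HEAD copies more during GC (231,158,688 vs 141,978,520 bytes), has a higher maximum residency (55,533,064 vs 37,811,840 bytes), and runs more Gen 1 collections (13 vs 8). I don't know if that's related.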
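To make the size of each difference explicit, the relative changes can be computed from the figures in the two `+RTS -s` summaries above. This is a small sketch using `awk` for the floating-point arithmetic; the function name `pct_change` is made up for this illustration, and the inputs are copied by hand from the output (old = 7.6.3, new = HEAD).

```shell
#!/bin/sh
# usage: pct_change OLD NEW
# Prints the change from OLD to NEW as a signed percentage, one decimal place.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

echo "Total time:       $(pct_change 0.37 0.43)"
echo "MUT time:         $(pct_change 0.24 0.25)"
echo "GC time:          $(pct_change 0.13 0.18)"
echo "Bytes copied:     $(pct_change 141978520 231158688)"
echo "Max residency:    $(pct_change 37811840 55533064)"
```

The total-time line corresponds to the 16% figure in the description. Most of the difference comes from the GC rows, while mutator time changes only slightly.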
Change History (32)
comment:1 Changed 19 months ago by tibbe
- Description modified (diff)
- Summary changed from unordered-containers 19% slower in HEAD vs 7.6.3 to unordered-containers 16% slower in HEAD vs 7.6.3
comment:10 Changed 19 months ago by tibbe
- Description modified (diff)
- Summary changed from unordered-containers 16% slower in HEAD vs 7.6.3 to Strictness analysis regression