Unnecessary Heap Allocations - Slow Performance
Operations from the vector library that should in principle run locally and fast are instead slow and allocate heavily on the heap. While I was trying to analyse what is going on, a strange effect showed up: compiling the attached small program with heap profiling support produces an executable that runs fast and uses the heap as expected, whereas the same program built without profiling support is slow. In the runs below, the non-profiled build allocates about 39 times as much (26,098,406,816 versus 667,829,536 bytes) and takes about 32 times as long (10.99s versus 0.34s). The effect shows on Linux on both amd64 and i386, using GHC 7.6.1 and 7.4.1, respectively.
- With profiling support
ghc --make -rtsopts -threaded -O2 -prof -fprof-auto heapAllocVec2.hs
./heapAllocVec2 +RTS -s -RTS 3628800
produces
fromList [3628800]
667,829,536 bytes allocated in the heap
125,768 bytes copied during GC
65,560 bytes maximum residency (2 sample(s))
20,096 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
...
Total time 0.34s ( 0.35s elapsed)
- Without profiling support
ghc --make -rtsopts -threaded -O2 heapAllocVec2.hs
./heapAllocVec2 +RTS -s -RTS 3628800
produces
fromList [3628800]
26,098,406,816 bytes allocated in the heap
22,674,848 bytes copied during GC
47,184 bytes maximum residency (2 sample(s))
22,448 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
...
Total time 10.99s ( 11.06s elapsed)