Parallel GC increases CPU load while slowing down program
I noticed this issue with a lot of my programs. I have no idea if this is a widely known issue or if I'm just particularly unlucky and/or unskilled when it comes to the GHC GC, but I thought it might be worth reporting as a bug.
Here's a fairly simple program showing the issue:
https://github.com/blitzcode/haskell-gol/tree/master/vector-glfwb
(Note: the program calls 'GHC.Conc.getNumProcessors >>= setNumCapabilities' at startup, which overrides the -N setting; that line needs to be removed for testing.)
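As an alternative to deleting the line, the call could be made optional so that the +RTS -N value is respected by default. The following is only a minimal sketch and is not taken from the linked program: the environment variable name GOL_ALL_CORES and the helper chooseCapabilities are hypothetical, made up for illustration.

```haskell
import Control.Monad (when)
import Data.Maybe (isJust)
import GHC.Conc (getNumCapabilities, getNumProcessors, setNumCapabilities)
import System.Environment (lookupEnv)

-- Hypothetical helper: pick the capability count.
-- With the override on, use every processor; otherwise keep the count
-- the RTS was started with (the +RTS -N value, or 1 by default).
chooseCapabilities :: Bool -> Int -> Int -> Int
chooseCapabilities useAll procs current
  | useAll    = procs
  | otherwise = current

main :: IO ()
main = do
  -- GOL_ALL_CORES is a made-up variable name for this sketch.
  useAll  <- isJust <$> lookupEnv "GOL_ALL_CORES"
  procs   <- getNumProcessors
  current <- getNumCapabilities
  let n = chooseCapabilities useAll procs current
  -- Only touch the RTS when the count actually changes.
  when (n /= current) $ setNumCapabilities n
  final <- getNumCapabilities
  putStrLn ("capabilities: " ++ show final)
```

With the variable unset, the program runs with whatever -N was given on the command line, which is what the measurements below rely on.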
On my quad core machine, this simple (non-parallel, some concurrency for draw & compute) Game-of-Life program runs as follows:
+RTS -N1 = ~520G/s, CPU Load ~100%
+RTS -N2 = ~505G/s, CPU Load ~135%
+RTS -N3 = ~485G/s, CPU Load ~150%
+RTS -N4 = ~485G/s, CPU Load ~160%
Specifying -qg1 caps the CPU load increase at ~135%, and throughput does not drop below ~505G/s. The statistics from +RTS -s also suggest that -qg1 decreases GC time and increases productivity. The program is a bit crummy, but it's the shortest example of this I have at hand. I've seen this in many different programs; serial GC just seems to be faster for a lot of workloads.
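For comparing runs without reading the full +RTS -s output each time, the same kind of numbers can be read from inside the program. This is a rough sketch, assuming a GHC new enough to provide GHC.Stats.getRTSStats (base 4.10 / GHC 8.2 or later, so newer than the 7.6.3 this report was filed against). The productivity helper is my own naming and only approximates the figure that -s prints.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Share of total time spent running program code rather than collecting.
-- Returns 0 when no time has been recorded yet.
productivity :: Int64 -> Int64 -> Double
productivity mutatorNs totalNs
  | totalNs <= 0 = 0
  | otherwise    = fromIntegral mutatorNs / fromIntegral totalNs

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; run with +RTS -T (or -s)"
    else do
      -- In a real program, call this after the workload has run.
      -- The values reflect the state at the most recent GC.
      s <- getRTSStats
      putStrLn ("collections:          " ++ show (gcs s))
      putStrLn ("GC CPU time (ns):     " ++ show (gc_cpu_ns s))
      putStrLn ("GC elapsed time (ns): " ++ show (gc_elapsed_ns s))
      putStrLn ("productivity (CPU):   "
                ++ show (productivity (mutator_cpu_ns s) (cpu_ns s)))
```

Running the same binary with +RTS -N4 -T and then +RTS -N4 -qg1 -T should make the difference in GC CPU time versus GC elapsed time visible side by side.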
I think it might at least be helpful to improve the documentation a bit, for example by suggesting some things to try for a GC speedup. Apologies if this is already a well-known issue or if I'm just doing something obviously dumb here that makes the GC perform poorly.
Trac metadata
Trac field | Value |
---|---|
Version | 7.6.3 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Runtime System |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | simonmar |
Operating system | |
Architecture | |