(super!) linear slowdown of parallel builds on 40 core machine
I'm seeing slowdowns in parallel builds of a (simple!!) 6 module project when I
build it on a 40 core server that I'm using for work. For any given ghc invocation
with -jn
, once n>10, I start to see a super-linear slow down as a function of n,
here are some basic numbers,
compile time | |
---|---|
-j1 | 0m2.693s |
-j4 | 0m2.507s |
-j10 | 0m2.763s |
-j25 | 0m12.634s |
-j30 | 0m39.154s |
-j40 | 0m57.511s |
-j60 | 2m21.821s |
These timings are another 2-4x worse if ghc is invoked indirectly via cabal-install / Setup.hs
According to the linux utility latencytop, 100% of ghc's cpu time was spent on user-space lock contention when I did the -j40
invocation.
The timing in the -j40
case stayed the same even when ghc was also passed -O0
(and -fforce-recomp
to ensure it did the same )
A bit of experimentation makes me believe that *any* cabalized project on a 40 core machine will exhibit this performance issue.
cabal clean
cabal configure --ghc-options="-j"
cabal build -j1
should be enough to trigger the lock contention.
That said, I'll try to cook up a minimal repro that i can share the source for post haste.