TBQueue leaks space under certain workloads
I'm using `TBQueue` and noticed suspiciously high memory usage, so I profiled the program and it turned out that `readTBQueue` leaks space (see attached before.png).
On closer inspection, the culprit is the `writeTVar rsize (r + 1)` in the definition of `readTBQueue`: after replacing it with `writeTVar rsize $! r + 1`, the leak is gone (see attached after.png).
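For reference, here's a minimal standalone sketch of the pattern (my own illustrative code, not the library source; the names are made up): bumping a `TVar Int` with a lazy `writeTVar` leaves one `(+ 1)` thunk behind per transaction, which is what happens to `rsize`, while the `$!` version keeps the counter evaluated.

```haskell
import Control.Concurrent.STM
import Control.Monad (replicateM_)

-- Lazy bump: writeTVar does not force its argument, so the TVar ends up
-- holding a growing chain of (+ 1) thunks -- the same shape of leak as the
-- rsize counter in readTBQueue.
lazyBump :: TVar Int -> STM ()
lazyBump v = do
  n <- readTVar v
  writeTVar v (n + 1)

-- Strict bump: force the increment before writing, as in the fix above.
strictBump :: TVar Int -> STM ()
strictBump v = do
  n <- readTVar v
  writeTVar v $! n + 1

main :: IO ()
main = do
  leaky <- newTVarIO (0 :: Int)
  replicateM_ 100000 (atomically (lazyBump leaky))
  readTVarIO leaky >>= print   -- forcing the value pays for the whole chain here

  flat <- newTVarIO (0 :: Int)
  replicateM_ 100000 (atomically (strictBump flat))
  readTVarIO flat >>= print    -- stays in constant space throughout
```

Running both variants under `+RTS -s` should show the residency difference, although on a much smaller scale than the numbers below.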
Here are the `+RTS -s` outputs:
Before:

```
366,535,518,024 bytes allocated in the heap
115,643,281,224 bytes copied during GC
241,356,416 bytes maximum residency (1182 sample(s))
1,516,944 bytes maximum slop
392 MB total memory in use (0 MB lost due to fragmentation)

                          Tot time (elapsed)  Avg pause  Max pause
Gen  0  247273 colls, 247273 par  128.854s  28.654s  0.0001s  0.0182s
Gen  1    1182 colls,   1181 par  352.162s  87.812s  0.0743s  0.1322s

Parallel GC work balance: 78.17% (serial 0%, perfect 100%)

TASKS: 24 (1 bound, 16 peak workers (23 total), using -N4)

SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

INIT   time    0.003s (   0.003s elapsed)
MUT    time  581.754s ( 226.191s elapsed)
GC     time  317.130s (  75.533s elapsed)
RP     time    0.000s (   0.000s elapsed)
PROF   time  163.885s (  40.933s elapsed)
EXIT   time    0.013s (   0.011s elapsed)
Total  time 1062.789s ( 301.738s elapsed)

Alloc rate    630,052,684 bytes per MUT second

Productivity  54.7% of total user, 61.4% of total elapsed

gc_alloc_block_sync: 8998531
whitehole_spin: 96
gen[0].sync: 180553
gen[1].sync: 31648044
```
After:

```
431,671,260,464 bytes allocated in the heap
86,540,207,400 bytes copied during GC
170,338,336 bytes maximum residency (1381 sample(s))
1,159,472 bytes maximum slop
260 MB total memory in use (0 MB lost due to fragmentation)

                          Tot time (elapsed)  Avg pause  Max pause
Gen  0  290179 colls, 290179 par  148.921s  33.097s  0.0001s  0.0217s
Gen  1    1381 colls,   1380 par  206.679s  51.492s  0.0373s  0.0528s

Parallel GC work balance: 75.51% (serial 0%, perfect 100%)

TASKS: 23 (1 bound, 17 peak workers (22 total), using -N4)

SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

INIT   time    0.005s (   0.004s elapsed)
MUT    time  681.718s ( 241.009s elapsed)
GC     time  258.643s (  60.370s elapsed)
RP     time    0.000s (   0.000s elapsed)
PROF   time   96.957s (  24.219s elapsed)
EXIT   time    0.009s (   0.007s elapsed)
Total  time 1037.335s ( 301.390s elapsed)

Alloc rate    633,210,748 bytes per MUT second

Productivity  65.7% of total user, 71.9% of total elapsed

gc_alloc_block_sync: 5494680
whitehole_spin: 184
gen[0].sync: 184109
gen[1].sync: 24223953
```
The attached patch fixes the problem (I made all `Int` increments/decrements in the module strict, as there is no need for them to be lazy).
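(As a side note, not part of the patch: the same strictness can also be spelled with `modifyTVar'` from `Control.Concurrent.STM.TVar`, which forces the updated value before writing it back. A sketch with a made-up name:)

```haskell
import Control.Concurrent.STM

-- Equivalent strict idiom: modifyTVar' evaluates the new value to WHNF
-- before writing, so no (+ 1) thunks pile up behind the counter.
bumpCounter :: TVar Int -> STM ()
bumpCounter counter = modifyTVar' counter (+ 1)
```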
Trac metadata

| Trac field | Value |
| --- | --- |
| Version | 8.2.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | libraries (other) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |