TBQueue leaks space under certain workloads
I'm using `TBQueue` and noticed suspiciously high memory usage, so I profiled the program and it turned out that `readTBQueue` leaks space (see attached before.png).
On closer inspection, the culprit is the `writeTVar rsize (r + 1)` in the definition of `readTBQueue`: after replacing it with `writeTVar rsize $! r + 1`, the leak is gone (see attached after.png).
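For reference, here's a minimal standalone sketch of the pattern (my own illustrative code, not the library source; the names are made up): bumping a `TVar Int` with a lazy `writeTVar` leaves one `(+ 1)` thunk behind per transaction, which is what happens to `rsize`, while the `$!` version keeps the counter evaluated.

```haskell
import Control.Concurrent.STM
import Control.Monad (replicateM_)

-- Lazy bump: writeTVar does not force its argument, so the TVar ends up
-- holding a growing chain of (+ 1) thunks -- the same shape of leak as the
-- rsize counter in readTBQueue.
lazyBump :: TVar Int -> STM ()
lazyBump v = do
  n <- readTVar v
  writeTVar v (n + 1)

-- Strict bump: force the increment before writing, as in the fix above.
strictBump :: TVar Int -> STM ()
strictBump v = do
  n <- readTVar v
  writeTVar v $! n + 1

main :: IO ()
main = do
  leaky <- newTVarIO (0 :: Int)
  replicateM_ 100000 (atomically (lazyBump leaky))
  readTVarIO leaky >>= print   -- forcing the value pays for the whole chain here

  flat <- newTVarIO (0 :: Int)
  replicateM_ 100000 (atomically (strictBump flat))
  readTVarIO flat >>= print    -- stays in constant space throughout
```

Running both variants under `+RTS -s` should show the residency difference, although on a much smaller scale than the numbers below.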
Here are the `+RTS -s` outputs:
Before:

```
366,535,518,024 bytes allocated in the heap
115,643,281,224 bytes copied during GC
241,356,416 bytes maximum residency (1182 sample(s))
1,516,944 bytes maximum slop
392 MB total memory in use (0 MB lost due to fragmentation)

                          Tot time (elapsed)  Avg pause  Max pause
Gen  0  247273 colls, 247273 par  128.854s  28.654s  0.0001s  0.0182s
Gen  1    1182 colls,   1181 par  352.162s  87.812s  0.0743s  0.1322s

Parallel GC work balance: 78.17% (serial 0%, perfect 100%)

TASKS: 24 (1 bound, 16 peak workers (23 total), using -N4)

SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

INIT   time    0.003s (   0.003s elapsed)
MUT    time  581.754s ( 226.191s elapsed)
GC     time  317.130s (  75.533s elapsed)
RP     time    0.000s (   0.000s elapsed)
PROF   time  163.885s (  40.933s elapsed)
EXIT   time    0.013s (   0.011s elapsed)
Total  time 1062.789s ( 301.738s elapsed)

Alloc rate    630,052,684 bytes per MUT second

Productivity  54.7% of total user, 61.4% of total elapsed

gc_alloc_block_sync: 8998531
whitehole_spin: 96
gen[0].sync: 180553
gen[1].sync: 31648044
```
After:

```
431,671,260,464 bytes allocated in the heap
86,540,207,400 bytes copied during GC
170,338,336 bytes maximum residency (1381 sample(s))
1,159,472 bytes maximum slop
260 MB total memory in use (0 MB lost due to fragmentation)

                          Tot time (elapsed)  Avg pause  Max pause
Gen  0  290179 colls, 290179 par  148.921s  33.097s  0.0001s  0.0217s
Gen  1    1381 colls,   1380 par  206.679s  51.492s  0.0373s  0.0528s

Parallel GC work balance: 75.51% (serial 0%, perfect 100%)

TASKS: 23 (1 bound, 17 peak workers (22 total), using -N4)

SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

INIT   time    0.005s (   0.004s elapsed)
MUT    time  681.718s ( 241.009s elapsed)
GC     time  258.643s (  60.370s elapsed)
RP     time    0.000s (   0.000s elapsed)
PROF   time   96.957s (  24.219s elapsed)
EXIT   time    0.009s (   0.007s elapsed)
Total  time 1037.335s ( 301.390s elapsed)

Alloc rate    633,210,748 bytes per MUT second

Productivity  65.7% of total user, 71.9% of total elapsed

gc_alloc_block_sync: 5494680
whitehole_spin: 184
gen[0].sync: 184109
gen[1].sync: 24223953
```
The attached patch fixes the problem (I made all `Int` increments/decrements in the module strict, as there is no need for them to be lazy).
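(As a side note, not part of the patch: the same strictness can also be spelled with `modifyTVar'` from `Control.Concurrent.STM.TVar`, which forces the updated value before writing it back. A sketch with a made-up name:)

```haskell
import Control.Concurrent.STM

-- Equivalent strict idiom: modifyTVar' evaluates the new value to WHNF
-- before writing, so no (+ 1) thunks pile up behind the counter.
bumpCounter :: TVar Int -> STM ()
bumpCounter counter = modifyTVar' counter (+ 1)
```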
Trac metadata

| Trac field | Value |
| --- | --- |
| Version | 8.2.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | libraries (other) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |