Opened 6 years ago

Closed 6 years ago

#2747 closed merge (fixed)

Excessive Memory Usage: space leak with foldl' on Integer

Reported by: guest Owned by: igloo
Priority: normal Milestone: 6.10.2
Component: Runtime System Version: 6.10.1
Keywords: Cc: rakowski@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

"foldl1' (*) [1..1000000]" for example allocates up to 1.9 GB of Memory in about 10 seconds and stops with "out of memory".

I guess I don't have to mention that with 6.8.3 it runs with only as much as 20MB.

Change History (8)

comment:1 Changed 6 years ago by dons

  • Architecture changed from x86_64 (amd64) to Unknown/Multiple
  • Component changed from Runtime System to Compiler
  • Operating System changed from Windows to Unknown/Multiple
  • Summary changed from Excessive Memory Usage to Excessive Memory Usage: space leak with foldl' on Integer

Or, more usefully,

import Data.List
main = print (foldl' (*) 1 [1..1000000])

Compiled with ghc 6.10 RC with -O2, leaks space,

$ time ./A +RTS -sstderr
./A +RTS -sstderr 
^C      60,664,416 bytes allocated in the heap
          39,520 bytes copied during GC
          18,856 bytes maximum residency (1 sample(s))
          20,856 bytes maximum slop
            3236 MB total memory in use (1015 MB lost due to fragmentation)

  Generation 0:   116 collections,     0 parallel,  0.12s,  0.09s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time   68.36s  ( 71.00s elapsed)
  GC    time    0.12s  (  0.09s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   68.48s  ( 71.08s elapsed)

  %GC time       0.2%  (0.1% elapsed)

  Alloc rate    887,353 bytes per MUT second

  Productivity  99.8% of total user, 96.2% of total elapsed


./A +RTS -sstderr  68.48s user 2.69s system 99% cpu 1:11.26 total

comment:2 Changed 6 years ago by nominolo

A heap profile shows ARR_WORDS as the culprit. The compiling the program with -O with either version gives an equivalent inner loop:

-- 6.8.2
Rec {
Main.lgo :: GHC.Num.Integer -> [GHC.Num.Integer] -> GHC.Num.Integer
[GlobalId]
[Arity 2
 NoCafRefs
 Str: DmdType SS]
Main.lgo =
  \ (z1_aw1 :: GHC.Num.Integer) (ds_aw2 :: [GHC.Num.Integer]) ->
    case ds_aw2 of wild_aw3 {
      [] -> z1_aw1;
      : x_aw7 xs1_aw8 ->
        case GHC.Num.timesInteger z1_aw1 x_aw7 of tpl_awa { __DEFAULT ->
        Main.lgo tpl_awa xs1_aw8
        }
    }
end Rec }
-- HEAD (6.11.20081103)
Rec {
Main.lgo :: GHC.Integer.Internals.Integer
            -> [GHC.Integer.Internals.Integer]
            -> GHC.Integer.Internals.Integer
[GlobalId]
[Arity 2
 NoCafRefs
 Str: DmdType SS]
Main.lgo =
  \ (z_ay4 :: GHC.Integer.Internals.Integer)
    (ds_ay5 :: [GHC.Integer.Internals.Integer]) ->
    case ds_ay5 of wild_ay6 {
      [] -> z_ay4;
      : x_aya xs_ayb ->
        case GHC.Integer.timesInteger z_ay4 x_aya of z'_ayd { __DEFAULT ->
        Main.lgo z'_ayd xs_ayb
        }
    }
end Rec }

My guess therefore is that GHC.Integer.timesInteger is the culprit.

comment:3 Changed 6 years ago by nominolo

  • Component changed from Compiler to Runtime System

Ok, I'm pretty sure now it's a GC bug. Compare:

-- 6.8.2
   Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
    bytes     bytes     bytes  user  elap    user    elap
   525284      2196     20480  0.00  0.00    0.01    0.01    0    0  (Gen:  1)
   522484      1452     20480  0.00  0.00    0.01    0.01    0    0  (Gen:  0)
   522460      1768     20480  0.00  0.00    0.01    0.02    0    0  (Gen:  0)
   522232      2092     20480  0.00  0.00    0.02    0.02    0    0  (Gen:  0)
   521508      2260     20480  0.00  0.00    0.02    0.02    0    0  (Gen:  0)
   521508      2432     20480  0.00  0.00    0.02    0.03    0    0  (Gen:  0)
   521508      2604     20480  0.00  0.00    0.02    0.03    0    0  (Gen:  0)
   521508      2776     20480  0.00  0.00    0.02    0.03    0    0  (Gen:  0)
   521508      2952     20480  0.00  0.00    0.03    0.03    0    0  (Gen:  0)
   521508      3128     20480  0.00  0.00    0.03    0.04    0    0  (Gen:  0)
   521508      3304     20480  0.00  0.00    0.03    0.04    0    0  (Gen:  0)
   592356        48     24576  0.00  0.00    0.04    0.04    0    0  (Gen:  0)
   538120        48     24576  0.00  0.00    0.04    0.05    0    0  (Gen:  0)
     ... etc ...
 89,691,972 bytes allocated in the heap
    432,780 bytes copied during GC (scavenged)
     37,336 bytes copied during GC (not scavenged)
     20,480 bytes maximum residency (1 sample(s))

        168 collections in generation 0 (  0.01s)
          1 collections in generation 1 (  0.00s)

          2 Mb total memory in use
-- HEAD (6.11.20081103)
$ ./tmp +RTS -Sstderr
./tmp +RTS -Sstderr 
    Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
    bytes     bytes     bytes  user  elap    user    elap
   525412      2256      4328  0.00  0.05    0.01    0.09    0    0  (Gen:  1)
   522484      2644      4772  0.00  0.00    0.01    0.10    0    0  (Gen:  0)
   522460      1764      5088  0.00  0.00    0.01    0.10    0    0  (Gen:  0)
   522232      2088      5412  0.00  0.00    0.02    0.10    0    0  (Gen:  0)
   521508      2256      5580  0.00  0.00    0.02    0.11    0    0  (Gen:  0)
   521508      2428      5752  0.00  0.00    0.02    0.11    0    0  (Gen:  0)
   521508      2600      5924  0.00  0.00    0.03    0.11    0    0  (Gen:  0)
   521508      2772      6096  0.00  0.00    0.03    0.12    0    0  (Gen:  0)
   521508      2948      6272  0.00  0.00    0.03    0.12    0    0  (Gen:  0)
   521508      3124      6448  0.00  0.00    0.04    0.12    0    0  (Gen:  0)
   521508      3300      6624  0.00  0.00    0.04    0.13    0    0  (Gen:  0)
   520256        44     14380  0.00  0.00    0.34    0.54    0    0  (Gen:  0)
...
       7,874,256 bytes allocated in the heap
         468,692 bytes copied during GC
           4,328 bytes maximum residency (1 sample(s))
          21,944 bytes maximum slop
              46 MB total memory in use (1 MB lost due to fragmentation)

(The memory usage is lower because I only run up to 10000). It looks to me like the GC is growing the heap even though it shouldn't. (Or could it be that the ARR_WORDS don't count differently?)

comment:4 Changed 6 years ago by nominolo

  • Owner set to nominolo
  • Status changed from new to assigned

Here's a small abstract of the output when run debugged RTS and +RTS -Dg (GC debugging):

allocated 1 megablock(s) at 0xae00000
allocated 1 megablock(s) at 0xaf00000
allocated 1 megablock(s) at 0xb000000
allocated 1 megablock(s) at 0xb100000
allocated 1 megablock(s) at 0xb200000
allocated 1 megablock(s) at 0xb300000
GC (gen 0): 151940 KB to collect, 152 MB in use, using 1 thread(s)
Memory inventory:
  gen 0 blocks : 37859 blocks (147 MB)
  gen 1 blocks :     4 blocks (0 MB)
  nursery      :   128 blocks (0 MB)
  retainer     :     0 blocks (0 MB)
  arena blocks :     0 blocks (0 MB)
  exec         :     0 blocks (0 MB)
  free         :   617 blocks (2 MB)
  total        : 38608 blocks (150 MB)
alloc new todo block 0x6d08000 for step 1
GC thread 0 working
scavenging block 0x6d08000 (gen 0, step 1) @ 0x6d08000
   scavenged 40 bytes
GC thread 0 idle (0 still running)
GC thread 0 finished.
GC thread 0 working

As you can see, it's adding megablocks (1MB) to the first generation.

comment:5 Changed 6 years ago by nominolo

  • Owner changed from nominolo to simonmar
  • Status changed from assigned to new

comment:6 Changed 6 years ago by simonmar

  • difficulty set to Unknown
  • Milestone set to 6.10.2

Testing the fix right now. We discovered the cause: basically the RTS isn't triggering a GC often enough, because some accounting got accidentally removed during some other refactoring. Large objects aren't being counted when deciding whether to GC, so a program that allocates all or mostly large objects can use a large amount of memory between GCs.

comment:7 Changed 6 years ago by simonmar

  • Owner changed from simonmar to igloo
  • Type changed from run-time performance bug to merge

Fixed:

Thu Nov  6 03:37:14 PST 2008  Simon Marlow <[email protected]>
  * allocateInGen(): increase alloc_blocks (#2747)

comment:8 Changed 6 years ago by igloo

  • Resolution set to fixed
  • Status changed from new to closed

Merged

Note: See TracTickets for help on using tickets.