Opened 15 months ago

Closed 2 months ago

#7636 closed bug (fixed)

threadStackUnderflow: not enough space for return values

Reported by: mojojojo
Owned by: simonmar
Priority: high
Milestone: 7.6.2
Component: Compiler
Version: 7.6.3
Keywords:
Cc:
Operating System: MacOS X
Architecture: x86_64 (amd64)
Type of failure: None/Unknown
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

While filling a Control.Concurrent.STM.TBQueue (stm-2.4) with values the system crashes with the following message:

internal error: threadStackUnderflow: not enough space for return values
    (GHC version 7.4.2 for x86_64_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Abort trap: 6

Change History (17)

comment:1 Changed 15 months ago by mojojojo

  • Architecture changed from Unknown/Multiple to x86_64 (amd64)

comment:2 Changed 15 months ago by thoughtpolice

Hi,

If possible, could you please attach or link to a minimal test case that reproduces the issue? It'll make diagnosing and fixing the issue much easier.

comment:3 Changed 15 months ago by mojojojo

My program has the following structure, but unfortunately this code does not reproduce the bug. I'm willing to answer any questions, and if it helps I can post partial code from my app; it won't compile on its own, but it may help you analyze the situation.

In the actual program, the eater and feeder functions perform HTTP and database requests.

import Control.Monad
import Control.Monad.Trans
import Control.Concurrent
import Control.Concurrent.STM
import Control.Concurrent.STM.TBQueue
import qualified Data.Vector as Vector

main = do
  valuesQueue <- atomically $ newTBQueue 50000

  forkIO $ runEater valuesQueue

  runFeeder valuesQueue 0


runFeeder valuesQueue offset = do
  putStrLn $ "Feeding " ++ show feederBatchSize ++ " values"
  let values = replicate feederBatchSize $ "A"
  if offset >= 10000000
    then putStrLn "Reached the end"
    else do
      atomically $ forM_ values $ writeTBQueue valuesQueue
      runFeeder valuesQueue (offset + feederBatchSize)

runEater valuesQueue = do
  values <- atomically $ 
    Vector.replicateM eaterBatchSize (readTBQueue valuesQueue)
  putStrLn $ "Eating " ++ (show $ Vector.length values) ++ " values"
  runEater valuesQueue 



feederBatchSize = 10000
eaterBatchSize = 500

comment:4 Changed 15 months ago by igloo

  • Difficulty set to Unknown
  • Owner set to simonmar

I don't know how feasible fixing this will be without a way to reproduce it.

Are you able to test it with GHC 7.6.2, or better still, HEAD, please?

Assigning to Simon, who I think is most familiar with this code.

comment:5 Changed 15 months ago by mojojojo

  • Version changed from 7.4.2 to 7.6.1

Tested my app on 7.6.1. The message still appears, although only under higher load now.

amazon-data-importer: internal error: threadStackUnderflow: not enough space for return values
    (GHC version 7.6.1 for x86_64_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Abort trap: 6

comment:6 Changed 15 months ago by mojojojo

I take back the statement that it depends on load. Here's a video that shows the issue clearly:
http://screencast.com/t/1qltqzTG

comment:7 Changed 15 months ago by simonmar

Could I get you to do some debugging for me? I need you to compile the program with -debug, and then run it under gdb. When it crashes, you should be able to say bt and get a backtrace that includes threadStackUnderflow, in particular this bit of code:

        // we have some return values to copy to the old stack
        if ((W_)(new_stack->sp - new_stack->stack) < retvals)
        {
            barf("threadStackUnderflow: not enough space for return values");
        }

I need to know the values of retvals, new_stack->sp, and new_stack->stack.

I think this crash can occur, but it should be very very very rare. Before I fix it, I need to ensure that something else hasn't gone wrong to cause this crash to happen, because then I'd need to find the root cause before adding a workaround.
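To illustrate what the check quoted above measures, here is a minimal standalone model, not the RTS code itself. It assumes a downward-growing stack chunk, so the free space is the number of words between the start of the chunk and sp. The Chunk struct, the word typedef, and the free_words and push_retvals helpers are all hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long word;

/* Hypothetical, simplified stand-in for one stack chunk. The stack grows
   downward, so the free words are those between stack[0] and sp. */
typedef struct {
    word *sp;        /* current top of stack (lowest used word) */
    word  stack[16]; /* storage; a real chunk is much larger */
} Chunk;

/* Free words below sp: the quantity the check compares against retvals. */
size_t free_words(const Chunk *c) {
    return (size_t)(c->sp - c->stack);
}

/* Copy `retvals` words onto the top of `dst`.
   Returns 0 when there is not enough room (the case in which the quoted
   code calls barf), and 1 on success. */
int push_retvals(Chunk *dst, const word *vals, size_t retvals) {
    assert(dst->sp >= dst->stack); /* sp must point inside the chunk */
    if (free_words(dst) < retvals) {
        return 0;
    }
    dst->sp -= retvals;
    memcpy(dst->sp, vals, retvals * sizeof(word));
    return 1;
}
```

In this model, a chunk with sp set to stack + 4 has four free words, so pushing six return values fails while pushing three succeeds. The three values requested above are what determine which side of that comparison the crashing program is on.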

comment:8 Changed 15 months ago by simonmar

Oh, you may need to unpack a copy of the GHC source code, and point gdb at the source tree using the directory command. Make sure you get the right version of the source code.

comment:9 Changed 15 months ago by simonmar

  • Milestone set to 7.6.2
  • Priority changed from normal to high

comment:10 Changed 15 months ago by mojojojo

  • Version changed from 7.6.1 to 7.6.2

Here's what I did, but I didn't get the output you expected:

huge-black-box-mac:src mojojojo$ ghc -debug AmazonDataImporter
[ 1 of 25] Compiling Database.PostgreSQL.Simple.QueryM ( Database/PostgreSQL/Simple/QueryM.hs, Database/PostgreSQL/Simple/QueryM.o )
[ 2 of 25] Compiling Database.PostgreSQL.Simple.SQL ( Database/PostgreSQL/Simple/SQL.hs, Database/PostgreSQL/Simple/SQL.o )
[ 3 of 25] Compiling Model.SearchSource ( Model/SearchSource.hs, Model/SearchSource.o )
[ 4 of 25] Compiling Util.PrettyPrint ( Util/PrettyPrint.hs, Util/PrettyPrint.o )
[ 5 of 25] Compiling OctalUTF8        ( OctalUTF8.hs, OctalUTF8.o )
[ 6 of 25] Compiling Model.SearchTask ( Model/SearchTask.hs, Model/SearchTask.o )
[ 7 of 25] Compiling Model.FileLogConfig ( Model/FileLogConfig.hs, Model/FileLogConfig.o )
[ 8 of 25] Compiling Model.OutputLogConfig ( Model/OutputLogConfig.hs, Model/OutputLogConfig.o )
[ 9 of 25] Compiling Model.Config     ( Model/Config.hs, Model/Config.o )
[10 of 25] Compiling Data.Aeson.Parser.Internal ( Data/Aeson/Parser/Internal.hs, Data/Aeson/Parser/Internal.o )
[11 of 25] Compiling Data.Aeson.Functions ( Data/Aeson/Functions.hs, Data/Aeson/Functions.o )
[12 of 25] Compiling Data.Aeson.Generic2 ( Data/Aeson/Generic2.hs, Data/Aeson/Generic2.o )
[13 of 25] Compiling Data.Yaml.Generic ( Data/Yaml/Generic.hs, Data/Yaml/Generic.o )
[14 of 25] Compiling VK.Database      ( VK/Database.hs, VK/Database.o )
[15 of 25] Compiling Model.AmazonDataImporterState ( Model/AmazonDataImporterState.hs, Model/AmazonDataImporterState.o )
[16 of 25] Compiling Util.Logging     ( Util/Logging.hs, Util/Logging.o )
[17 of 25] Compiling Database.PostgreSQL.Simple.QueryA ( Database/PostgreSQL/Simple/QueryA.hs, Database/PostgreSQL/Simple/QueryA.o )
[18 of 25] Compiling Database.PostgreSQL.Simple.Queries ( Database/PostgreSQL/Simple/Queries.hs, Database/PostgreSQL/Simple/Queries.o )
[19 of 25] Compiling Database.Amazon.Artist ( Database/Amazon/Artist.hs, Database/Amazon/Artist.o )
[20 of 25] Compiling Database.Amazon.MP3 ( Database/Amazon/MP3.hs, Database/Amazon/MP3.o )
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package pretty-1.1.1.0 ... linking ... done.
Loading package array-0.4.0.1 ... linking ... done.
Loading package deepseq-1.3.0.1 ... linking ... done.
Loading package containers-0.5.0.0 ... linking ... done.
Loading package bytestring-0.10.0.2 ... linking ... done.
Loading package text-0.11.2.3 ... linking ... done.
Loading package attoparsec-0.10.4.0 ... linking ... done.
Loading package blaze-builder-0.3.1.0 ... linking ... done.
Loading package old-locale-1.0.0.5 ... linking ... done.
Loading package time-1.4.0.1 ... linking ... done.
Loading package primitive-0.5.0.1 ... linking ... done.
Loading package vector-0.10.0.1 ... linking ... done.
Loading package blaze-textual-0.2.0.8 ... linking ... done.
Loading package postgresql-libpq-0.8.2.2 ... linking ... done.
Loading package template-haskell ... linking ... done.
Loading package transformers-0.3.0.0 ... linking ... done.
Loading package postgresql-simple-0.2.4.1 ... linking ... done.
[21 of 25] Compiling Database.Amazon  ( Database/Amazon.hs, Database/Amazon.o )
[22 of 25] Compiling Control.Concurrent.STM.TBQueue.Util ( Control/Concurrent/STM/TBQueue/Util.hs, Control/Concurrent/STM/TBQueue/Util.o )
[23 of 25] Compiling Control.Monad.Parallel ( Control/Monad/Parallel.hs, Control/Monad/Parallel.o )
[24 of 25] Compiling Config           ( Config.hs, Config.o )
[25 of 25] Compiling Main             ( AmazonDataImporter.hs, AmazonDataImporter.o )
Linking AmazonDataImporter ...
huge-black-box-mac:src mojojojo$ gdb AmazonDataImporter
GNU gdb 6.3.50-20050815 (Apple version gdb-1822) (Sun Aug  5 03:00:42 UTC 2012)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ..... done

(gdb) directory ~/Downloads/ghc-ghc-7.6.2-release
Source directories searched: /Users/mojojojo/Downloads/ghc-ghc-7.6.2-release:$cdir:$cwd
(gdb) run
Starting program: /Users/mojojojo/Dropbox/Dev/radiox/vk-bot/src/AmazonDataImporter 
Reading symbols for shared libraries ++++.................................................... done
Reading symbols for shared libraries ....................... done
2013-02-06 20:01:56 MSK: Fetching 1000 tasks
AmazonDataImporter: Control.Monad.Trans.Resource.register': The mutable state is being accessed after cleanup. Please contact the maintainers.
2013-02-06 20:01:57 MSK: Fetching 1000 tasks
2013-02-06 20:01:58 MSK: Fetching 1000 tasks
2013-02-06 20:01:58 MSK: Fetching 1000 tasks
AmazonDataImporter: internal error: threadStackUnderflow: not enough space for return values
    (GHC version 7.6.2 for x86_64_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

Program received signal SIGABRT, Aborted.
0x00007fff8e63fd46 in __kill ()
(gdb) bt
#0  0x00007fff8e63fd46 in __kill ()
#1  0x00007fff99debdf0 in abort ()
#2  0x00000001011830b0 in rtsFatalInternalErrorFn (s=Could not find the frame base for "rtsFatalInternalErrorFn".
) at rts/RtsMessages.c:170
#3  0x0000000101182c28 in barf (s=0x1012020d0 "threadStackUnderflow: not enough space for return values") at rts/RtsMessages.c:42
#4  0x000000010118fee5 in threadStackUnderflow (cap=0x10134e1c0, tso=0x103993bc0) at rts/Threads.c:686
#5  0x000000010118a3be in findRetryFrameHelper (cap=0x10134e1c0, tso=0x103993bc0) at rts/Schedule.c:2782
#6  0x00000001011ad11e in stg_retryzh ()
Previous frame inner to this frame (gdb could not unwind past this frame)
(gdb) 

The source directory I pointed it to was the root of the unpacked archive I got from here: https://github.com/ghc/ghc/archive/ghc-7.6.2-release.zip

You'll have to provide some step-by-step instructions if you need something more.

comment:11 Changed 15 months ago by simonmar

Actually that helps me a lot, thanks! I think I see the bug, but I need to try to construct a test case myself that fails.

comment:12 Changed 15 months ago by mojojojo

Great. Good luck!

comment:13 Changed 15 months ago by marlowsd@…

commit 2f7044dee40ba6eadc1877ec49c30e1695d63fe4

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Thu Feb 7 09:55:20 2013 +0000

    Tidy up tso->stackobj before calling threadStackUnderflow (#7636)
    
    Fixes the following crash:
    
      internal error: threadStackUnderflow: not enough space for return values
    
    when using STM.

 rts/Schedule.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

comment:14 Changed 15 months ago by simonmar

  • Status changed from new to merge

comment:15 Changed 11 months ago by errge

  • Version changed from 7.6.2 to 7.6.3

This came up for me today. I applied this one-line patch to stock GHC 7.6.3 and the bug went away. Is there any chance of this fix being merged to the 7.6 branch and released with 7.6.4?

32-bit linux.

Thanks!

comment:16 Changed 7 months ago by bgamari

I believe all that remains to be done for this bug is to cherry-pick 2f7044dee40 onto the ghc-7.6 branch. It seems to apply cleanly. Could someone handle this?


comment:17 Changed 2 months ago by thoughtpolice

  • Resolution set to fixed
  • Status changed from merge to closed

Closing. There won't be a 7.6.4 as we're doing 7.8 right now.
