Opened 3 months ago

Last modified 3 months ago

#15595 new bug

Stack overflow in withArgs leads to infinite memory-consuming loop

Reported by: NeilMitchell Owned by:
Priority: normal Milestone: 8.6.1
Component: Compiler Version: 8.5
Keywords: Cc: ndmitchell@…
Operating System: Windows Architecture: x86_64 (amd64)
Type of failure: Runtime crash Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Given the program:

import System.Environment

main :: IO ()
main = do
    putStrLn "Starting"
    withArgs (replicate 1000 "") $ return ()

When run with:

ghc --make WithArgsBug.hs -rtsopts && WithArgsBug +RTS -K1K

The program prints out "Starting", then loops forever, taking 1 CPU core and allocating memory (approx 1Gb per min), until the computer is unresponsive. The program does not respond to Ctrl-C and has to be killed from the task manager.

The -K1K flag limits the stack to approx 33Kb.

Change History (5)

comment:1 Changed 3 months ago by osa1

Version: 8.4.38.5

I did a little bit of debugging -- basically the RTS is throwing a stack overflow exception, but the mutator is then trying to allocate more stack space (maybe because stack overflow exception is somehow masked?), causing a loop.

A different variant of this program exits with a stack overflow exception:

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types
import Foreign.C.String
import Foreign.C
import Foreign.Ptr
import GHC.Foreign (withCStringsLen)
import GHC.IO.Encoding (utf8)

foreign import ccall unsafe "setProgArgv"
  c_setProgArgv  :: CInt -> Ptr CString -> IO ()

main :: IO ()
main = do
    putStrLn "Starting"
    withCStringsLen utf8 (replicate 1000 "") $ \len css -> do
      c_setProgArgv (fromIntegral len) css

Output:

$ ./Main +RTS -K1K
Starting
Main: Stack space overflow: current size 33624 bytes.
Main: Use `+RTS -Ksize -RTS' to increase it.

I don't know why the withArgs version doesn't fail with the same error yet, but I think Note [Throw to self when masked] is relevant.

(Confirmed on GHC HEAD so updating the version)

Last edited 3 months ago by osa1 (previous) (diff)

comment:2 Changed 3 months ago by osa1

Right, so this is because the thread is in masked state and stack overflow exception is not actually raised because of this. If I change the program above to this:

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C
import Foreign.Ptr
import GHC.Foreign (withCStringsLen)
import GHC.IO.Encoding (utf8)
import Control.Exception (mask_)

foreign import ccall unsafe "setProgArgv"
  c_setProgArgv  :: CInt -> Ptr CString -> IO ()

main :: IO ()
main = do
    putStrLn "Starting"
    mask_ $
      withCStringsLen utf8 (replicate 1000 "") $ \len css -> do
        c_setProgArgv (fromIntegral len) css
    putStrLn "Done"

(only difference is that I added a mask_) this also loops.

Not sure about what's the right thing to do here ...

comment:3 Changed 3 months ago by osa1

The reason why the original code loops while my first example doesn't is because I don't use getArgs. getArgs calls this function:

withProgArgv :: [String] -> IO a -> IO a
withProgArgv new_args act = do
  pName <- System.Environment.getProgName
  existing_args <- System.Environment.getArgs
  bracket_ (setProgArgv new_args)
           (setProgArgv (pName:existing_args))
           act

The setProgArgv new_args part is where we get a stack overflow, but it's evaluated in masked state because of bracket_:

bracket_ before after thing = bracket before (const after) (const thing)

bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r

comment:4 Changed 3 months ago by osa1

Here's another reproducer:

import Control.Exception (mask_)

main :: IO ()
main = mask_ (print (foldl (+) 0 [0 .. 1000000]))

If you remove mask_ this fails with stack overflow (add +RTS -K1K). With mask_ it gets stuck in the same loop as the original program.

comment:5 Changed 3 months ago by simonmar

Isn't this the correct behaviour? StackOverflow is an asynchronous exception, so we can't throw it inside mask. I don't understand why the program loops, though.

Note: See TracTickets for help on using tickets.