Opened 3 months ago

Last modified 3 months ago

#8648 new bug

Initialization of C statics broken in threaded runtime

Reported by: edsko
Owned by: simonmar
Priority: normal
Milestone:
Component: Runtime System
Version: 7.7
Keywords:
Cc: simonmar
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

Consider a tiny package static-value, consisting of one Haskell file

{-# LANGUAGE ForeignFunctionInterface #-}
module StaticValue where
import Foreign.C.Types (CInt)

foreign import ccall unsafe "returnStaticValue" c_returnStaticValue :: IO CInt

printStaticValue :: IO ()
printStaticValue = print =<< c_returnStaticValue

and one corresponding C file

static int theStaticValue = 0;

int returnStaticValue() {
  // Modify the static so the C compiler doesn't optimize it away
  return theStaticValue++;
}

(test case is attached). If we call printStaticValue using the GHC API:

runGhc (Just libdir) $ do
    flags0 <- getSessionDynFlags
    void $ setSessionDynFlags flags0 {
        hscTarget = HscInterpreted
      , ghcLink   = LinkInMemory
      , ghcMode   = CompManager
      }

    setContext $ [ IIDecl $ simpleImportDecl $ mkModuleName "StaticValue" ]
    _ <- runStmt "StaticValue.printStaticValue" RunToCompletion

then we see "0", as expected. However, if we compile this code using the threaded runtime and wrap the above code in a call to either forkIO or forkOS, a different value is printed (-907777; I have not determined where that value comes from).

Some notes:

  • I have been unable to reproduce this bug without using the GHC API; in particular, calling printStaticValue directly, whether or not it is wrapped in forkIO or forkOS, always works as expected.
  • If I change the initialization value of theStaticValue from 0 to anything else (say, 1234), we always get the right answer, never the uninitialized value. Presumably this is because non-zero values require some explicit code to be run (and it does get run), while a zero value gets initialized differently (and apparently, that's where the bug is).
  • I have reproduced this bug in both ghc 7.4 and 7.7.20131227.

This ticket is the result of tracking down a problem with calling createProcess from within the GHC API, which would cause the parent process to stall. As it turns out, runProcess.c (from the process library) declares a static long max_fd = 0, and runInteractiveProcess checks whether this value is 0; if it is, it makes a syscall to determine the maximum FD. But because this static is not initialized properly (the bug reported in this ticket), it is left at a garbage value (281474975802879; possibly random, but the same on every run). The zero check therefore never fires, the child process proceeds to close far too many file descriptors (if close_fds was set to True), and the parent stalls. Indeed, changing the initialization to static long max_fd = -1 (and adjusting the later check for zero accordingly) fixes this, so this is a viable workaround in process if we cannot track down the bug in GHC.
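The workaround described above can be sketched roughly as follows. This is not the actual code from runProcess.c: the function name get_max_fd, the use of sysconf(_SC_OPEN_MAX) as the syscall, and the fallback value of 256 are all assumptions made for illustration. The point is only that the "not yet computed" sentinel is non-zero, so on typical toolchains the static carries an explicit initial value in the object file and does not depend on the loader zeroing .bss.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for the static in runProcess.c. The initializer is
   -1 rather than 0, so the value does not rely on zero-initialized memory. */
static long max_fd = -1;

/* Returns the cached descriptor limit, querying the OS on first use.
   Using sysconf(_SC_OPEN_MAX) here is an assumption for this sketch. */
long get_max_fd(void)
{
    if (max_fd == -1) {              /* sentinel: limit not computed yet */
        long limit = sysconf(_SC_OPEN_MAX);
        /* sysconf returns -1 on error or when the limit is indeterminate;
           fall back to an arbitrary conservative value (assumption) so the
           sentinel is never stored back. */
        max_fd = (limit > 0) ? limit : 256;
    }
    assert(max_fd > 0);
    return max_fd;
}
```

The first call to get_max_fd() performs the query and later calls return the cached value, which mirrors the check-then-compute pattern the ticket describes for runInteractiveProcess.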

Attachments (2)

T8648.hs (907 bytes) - added by edsko 3 months ago.
static-value-0.1.0.0.tar.gz (621 bytes) - added by edsko 3 months ago.

Change History (4)

comment:1 Changed 3 months ago by edsko

I should have mentioned, I can only reproduce this on Linux; on OSX I always get the right answer.

comment:2 Changed 3 months ago by duncan

Also note that it works when the static-value package is built as a dynamic lib and the top level exe uses dynamic libs.

So what it looks like is that the ghci linker is not zeroing the memory allocated for the zero-init (.bss) section from the object files.

Of course, we know the linker does have code to zero the .bss (it uses calloc), and it works when not using forkIO.
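To make the hypothesis concrete, here is a small model of what "not zeroing the .bss" would mean. It is not GHC's linker code; stale_region, place_bss and read_static_int are hypothetical names invented for this sketch, and the 0xA5 poison byte simply stands in for whatever an earlier allocation left behind. A .bss section has no bytes in the object file, so a zero-initialized static only reads as 0 if the loader clears the memory itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a buffer filled with a non-zero poison byte,
   standing in for memory that an earlier allocation has already dirtied.
   The caller owns the buffer and must free() it. */
unsigned char *stale_region(size_t size)
{
    unsigned char *region = malloc(size);
    assert(region != NULL);
    memset(region, 0xA5, size);
    return region;
}

/* Hypothetical model of a loader placing a .bss section into `region`.
   Nothing is copied from the object file; the section is only cleared
   when zero_fill is non-zero. */
void place_bss(unsigned char *region, size_t size, int zero_fill)
{
    assert(region != NULL);
    if (zero_fill) {
        memset(region, 0, size);
    }
}

/* Reads the int that "lives" at the given byte offset of the section. */
int read_static_int(const unsigned char *region, size_t offset)
{
    int value;
    memcpy(&value, region + offset, sizeof value);
    return value;
}
```

With zero_fill set to 0, read_static_int returns the poison pattern instead of 0, which is the kind of stable but wrong value reported in this ticket. With zero_fill set to 1, every static in the region reads as 0.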
