Opened 5 years ago
Last modified 3 years ago
#8648 new bug
Initialization of C statics broken in threaded runtime
Reported by: | edsko | Owned by: | simonmar |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Runtime System | Version: | 7.7 |
Keywords: | | Cc: | simonmar, snoyberg |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | None/Unknown | Test Case: | |
Blocked By: | | Blocking: | |
Related Tickets: | | Differential Rev(s): | |
Wiki Page: | | | |
Description
Consider a tiny package static-value, consisting of one Haskell file

    foreign import ccall unsafe "returnStaticValue"
      c_returnStaticValue :: IO CInt

    printStaticValue :: IO ()
    printStaticValue = print =<< c_returnStaticValue

and one corresponding C file

    static int theStaticValue = 0;

    int returnStaticValue() {
      // Modify the static so the C compiler doesn't optimize it away
      return theStaticValue++;
    }

(test case is attached). If we call printStaticValue using the GHC API:

    runGhc (Just libdir) $ do
      flags0 <- getSessionDynFlags
      void $ setSessionDynFlags flags0 {
          hscTarget = HscInterpreted
        , ghcLink   = LinkInMemory
        , ghcMode   = CompManager
        }
      setContext $ [ IIDecl $ simpleImportDecl $ mkModuleName "StaticValue" ]
      _ <- runStmt "StaticValue.printStaticValue" RunToCompletion

then we see "0", as expected. However, if we compile this code using the threaded runtime, and we wrap the above code in a call to either forkIO or forkOS, then we see a different value printed (-907777, whatever that value is).
Some notes:
- I have been unable to reproduce this bug without using the GHC API; in particular, calling printStaticValue directly, wrapped in forkIO or forkOS or not, always works as expected.
- If I change the initialization value of theStaticValue from 0 to anything else (say, 1234), we always get the right answer, never the uninitialized value. Presumably this is because non-zero values require some explicit code to be run (and it does get run), while a zero value gets initialized differently (and apparently, that's where the bug is).
- I have reproduced this bug in both ghc 7.4 and 7.7.20131227.
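
To make the second note concrete, here is a minimal C sketch of the two kinds of static. The variable and function names are hypothetical and not part of the attached test case, and the section placement described in the comments is what common ELF toolchains typically do, not something the C language guarantees.

```c
/* Illustrative only: these names are hypothetical, not from the test case. */

/* Zero initializer: compilers typically place this in .bss, which takes up
   no bytes in the object file. Whoever loads the object has to provide
   zeroed memory for it. */
static int zero_static = 0;

/* Non-zero initializer: typically placed in .data, whose bytes are stored
   in the object file and copied into memory when the object is loaded. */
static int nonzero_static = 1234;

/* Same shape as returnStaticValue: return the current value, then bump it. */
int next_zero_static(void)    { return zero_static++; }
int next_nonzero_static(void) { return nonzero_static++; }
```

Under a correct loader the first call to each function returns the initializer (0 and 1234 respectively). The behaviour in this ticket would correspond to only the first of these going wrong.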
This ticket is the result of tracking down a problem with calling createProcess from within the GHC API, which would cause the parent process to stall. As it turns out, runProcess.c (from the process library) declares a static long max_fd = 0, and in runInteractiveProcess checks for this value to be 0, and if it is, does a syscall to figure out what the maximum FD is. But since this static does not get initialized properly (the bug reported in this ticket), it gets left at its (random? but always the same) value (281474975802879), so that the child process proceeds to close rather too many file descriptors (if close_fds was set to True) and the parent stalls. Indeed, changing the initialization to static long max_fd = -1 (and adjusting the later check for zero accordingly) fixes this (so this is a viable workaround in process if we cannot track down the bug in GHC).
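
A rough sketch of that workaround pattern follows. This is not the actual runProcess.c source: the function name get_max_fd is hypothetical, and using sysconf(_SC_OPEN_MAX) for the lookup (with an arbitrary fallback of 256) is an assumption made only to keep the example self-contained.

```c
#include <unistd.h>

/* Hypothetical sketch, not the real runProcess.c code.
   -1 is a non-zero sentinel meaning "not computed yet"; because the
   initializer is non-zero, the variable does not depend on zero-filled
   memory being set up correctly. */
static long max_fd = -1;

long get_max_fd(void)
{
    if (max_fd == -1) {
        /* Assumption: sysconf(_SC_OPEN_MAX) stands in for whatever lookup
           the real code performs. It returns -1 if the limit is
           indeterminate, so fall back to an arbitrary value then. */
        long limit = sysconf(_SC_OPEN_MAX);
        max_fd = (limit > 0) ? limit : 256;
    }
    return max_fd;
}
```

The first call performs the lookup and caches it; later calls return the cached value.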
Attachments (2)
Change History (7)
Changed 5 years ago by
Attachment: static-value-0.1.0.0.tar.gz added
comment:2 Changed 5 years ago by
Also note that it works when the static-value package is built as a dynamic lib and the top level exe uses dynamic libs.
So what it looks like is that the ghci linker is not zeroing the memory allocated for the zero-init (.bss) section from the object files.
Of course, we know the linker does have code to zero the .bss (it uses calloc), and it works when not using forkIO.
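
For reference, the property being relied on here can be sketched as below. This is not GHC's linker code; alloc_bss and bss_is_zeroed are hypothetical helpers that only illustrate what a loader has to guarantee for a zero-init section.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the GHC runtime linker.
   calloc returns zero-filled memory, which is what a .bss section needs;
   malloc would leave the contents indeterminate. */
unsigned char *alloc_bss(size_t size)
{
    return calloc(size, 1);
}

/* Returns 1 if every byte of the region is zero, 0 otherwise. */
int bss_is_zeroed(const unsigned char *bss, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (bss[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

If the statics in the object end up resolved to memory other than a region allocated like this, they would not be guaranteed to read as zero, which would be consistent with the symptom above.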
comment:3 Changed 4 years ago by
Cc: snoyberg added
comment:4 Changed 4 years ago by
I *think* this bug is fixed by Phab:D975. There were bugs in the way we were allocating the BSS segment for dynamically linked code.
I should have mentioned, I can only reproduce this on Linux; on OSX I always get the right answer.