#7715 closed bug (duplicate)

threadDelay causes segfault on Mac if compiled by 32bit GHC

Reported by: kazu-yamamoto Owned by:
Priority: high Milestone: 7.8.1
Component: Compiler Version: 7.7
Keywords: Cc: pho@…
Operating System: MacOS X Architecture: x86
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

The following code causes a segfault

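import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (replicateM_)

main :: IO ()
main = do
    -- Fork 100 threads; each sleeps for 1 second, then prints.
    replicateM_ 100 $ forkIO $ do
        threadDelay 1000000
        putStrLn "Hello, world!"
    -- Keep the main thread alive for 5 seconds so the forked threads can run.
    threadDelay 5000000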

if compiled with 32bit GHC head on Mac.

64bit GHC head does not cause this problem. 32bit GHC 7.4.2 does not, either. I don't see this bug on either FreeBSD or Linux.

"gdb" caught the following on each run:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000005
[Switching to process 51076 thread 0x20b]
0x00000005 in ?? ()
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000019
[Switching to process 50933 thread 0x20b]
0x00000019 in ?? ()
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x40e40348
[Switching to process 51004 thread 0x20b]
0x00f28aa5 in base_GHCziEventziPSQ_atMostzuzdszdwatMosts_info ()

Change History (17)

comment:1 Changed 14 months ago by igloo

  • Difficulty set to Unknown
  • Resolution set to duplicate
  • Status changed from new to closed

Thanks for the report. Happily, this sounds like a duplicate of the already-fixed #7299.

comment:2 Changed 14 months ago by kazu-yamamoto

  • Resolution duplicate deleted
  • Status changed from closed to new

I think this bug is different from #7299.

My base library has 8a3399d. And this is a bug not in GHCi, but in compiled code.

comment:3 Changed 14 months ago by kazu-yamamoto

With 32bit GHC on Mac, this only occurs if "-threaded" is specified.

comment:4 Changed 13 months ago by PHO

  • Cc pho@… added

The segfault doesn't occur on GHC 7.6.2 for powerpc-apple-darwin (32bit).

comment:5 Changed 13 months ago by thorkilnaur

  • Version changed from 7.6.2 to 7.7

On the tn23 builder (http://darcs.haskell.org/ghcBuilder/builders/tn23/index.html):

$ uname -a
Darwin thorkil-naurs-intel-mac-mini.local 9.8.0 Darwin Kernel Version 9.8.0: Wed Jul 15 16:55:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_I386 i386
$ cat T7715A.hs 
import Control.Monad
import Control.Concurrent
main :: IO ()
main = do
    replicateM_ 100 $ forkIO $ do
        threadDelay 1000000 
        putStrLn "Hello, world!"
    threadDelay 5000000
$ /Users/thorkilnaur/tn/builders/GHCBuilder/tn23/builder/tempbuild/build/inplace/bin/ghc-stage2 --make T7715A.hs -threaded
[1 of 1] Compiling Main             ( T7715A.hs, T7715A.o )
Linking T7715A ...
$ ./T7715A 
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Hello, world!
Bus error
$ ./T7715A 
Hello, world!
Segmentation fault
$ ./T7715A 
Hello, world!
Bus error
$ 

Best regards
Thorkil

comment:6 Changed 13 months ago by igloo

  • Architecture changed from Unknown/Multiple to x86
  • Milestone set to 7.8.1
  • Operating System changed from Unknown/Multiple to MacOS X
  • Priority changed from normal to high

comment:7 Changed 12 months ago by kazu-yamamoto

This could be related to #7043 and #7474.

comment:8 Changed 12 months ago by kazu-yamamoto

It turns out that this bug is not specific to 32bit Mac but IS specific to 32bit. I reproduced this bug on 32bit Linux.

comment:9 Changed 12 months ago by kazu-yamamoto

Since I suspect the atMost function, I exported GHC.Event.* by editing libraries/base/base.cabal and rebuilt GHC.

With this GHC head on 32bit Mac, the following code sometimes causes a segfault/bus error if compiled with -threaded:

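module Main where

-- GHC.Event.PSQ and GHC.Event.Unique are importable only from a GHC built
-- with GHC.Event.* exposed in base.cabal, as described above.
import Control.Monad (replicateM)
import qualified GHC.Event.PSQ as Q
import GHC.Event.Unique (UniqueSource, newSource, newUnique)
import System.Random (randomIO)

main :: IO ()
main = do
    s <- newSource
    ents <- replicateM 100 (entry s)
    let q = fold ents
    print $ Q.atMost 1.5 q

-- Build a queue by inserting every (key, priority) pair.
fold :: [(Q.Key, Q.Prio)] -> Q.PSQ ()
fold [] = Q.empty
fold ((u,r):xs) = Q.insert u r () $ fold xs

-- Pair a fresh unique key with a random priority.
entry :: UniqueSource -> IO (Q.Key, Q.Prio)
entry s = do
    u <- newUnique s
    r <- randomIO
    return (u,r)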

I believe that this test code makes debugging much easier.

Note that if I copy PSQ.hs and Unique.hs, modify their module names, and compile the Main.hs file with them by GHC head on 32bit Mac, it does not cause a segfault/bus error.

comment:10 Changed 12 months ago by kazu-yamamoto

I confirmed that this code causes segfault on 32bit Linux.

comment:11 Changed 12 months ago by igloo

I can't reproduce this, with a fast validate build and the code at the top of the ticket (plus imports for Control.Concurrent and Control.Monad):

$ ./test +RTS --info
 [("GHC RTS", "YES")
 ,("GHC version", "7.7.20130427")
 ,("RTS way", "rts_thr")
 ,("Build platform", "i386-apple-darwin")
 ,("Build architecture", "i386")
 ,("Build OS", "darwin")
 ,("Build vendor", "apple")
 ,("Host platform", "i386-apple-darwin")
 ,("Host architecture", "i386")
 ,("Host OS", "darwin")
 ,("Host vendor", "apple")
 ,("Target platform", "i386-apple-darwin")
 ,("Target architecture", "i386")
 ,("Target OS", "darwin")
 ,("Target vendor", "apple")
 ,("Word size", "32")
 ,("Compiler unregisterised", "NO")
 ,("Tables next to code", "YES")
 ]
$ time ./test | uniq
Hello, world!
                                   
real    0m5.019s
user    0m0.019s
sys     0m0.013s

comment:12 Changed 12 months ago by kazu-yamamoto

My test program on Mac displays exactly the same message.

Did you execute your test program more than once? It does not always cause a segfault.

Anyway, Thorkil could reproduce this. So, this problem actually exists.
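
Because the crash is intermittent, one way to check for it is to run the compiled binary many times and count the runs that do not exit cleanly. The following is only a minimal sketch, assuming the process package is available; the helper names tally and runOnce and the run count of 20 are made up for illustration, and the binary path is taken from the command line (for example ./T7715A from comment 5).

```haskell
import Control.Monad (replicateM)
import System.Environment (getArgs)
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Count (clean exits, abnormal exits) in a list of exit codes.
-- Anything other than ExitSuccess (including death by a signal such as
-- SIGSEGV or SIGBUS) is counted as abnormal.
tally :: [ExitCode] -> (Int, Int)
tally codes = (length ok, length bad)
  where
    ok  = filter (== ExitSuccess) codes
    bad = filter (/= ExitSuccess) codes

-- | Run the binary once with no arguments and empty stdin,
-- discard its output, and return its exit code.
runOnce :: FilePath -> IO ExitCode
runOnce exe = do
    (code, _out, _err) <- readProcessWithExitCode exe [] ""
    return code

main :: IO ()
main = do
    args <- getArgs
    case args of
      [exe] -> do
          -- 20 runs is an arbitrary choice for illustration.
          codes <- replicateM 20 (runOnce exe)
          let (ok, bad) = tally codes
          putStrLn $ show ok ++ " clean runs, " ++ show bad ++ " abnormal exits"
      _ -> putStrLn "usage: repeat-run PATH-TO-BINARY"
```

A single abnormal exit in such a batch would be enough to show that the bug is present on that machine, while a batch with none does not prove its absence.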

comment:13 Changed 12 months ago by igloo

Yes, I tried several times.

I didn't mean to imply that it doesn't exist, just that I can't reproduce it (on OS X x86, at least). I'll see if I have more luck on Linux/x86 soon.

comment:14 Changed 12 months ago by kazu-yamamoto

Dynamic linking ("cabal install --enable-shared") does not solve this problem.

comment:15 Changed 11 months ago by igloo

I can't reproduce it on i386/Linux either.

comment:16 Changed 11 months ago by kazu-yamamoto

It turns out that this is a bug in the -O2 option. Please read and try this:

https://github.com/kazu-yamamoto/buggy-psq

comment:17 Changed 11 months ago by kazu-yamamoto

  • Resolution set to duplicate
  • Status changed from new to closed

This ticket was moved to: #7953
