Opened 4 years ago

Closed 3 years ago

#5553 closed bug (worksforme)

sendWakeup error in simple test program with MVars and killThread

Reported by: bit
Owned by: tibbe
Priority: high
Milestone: 7.4.2
Component: Runtime System
Version: 7.2.1
Keywords:
Cc: johan.tibell@…, roma@…
Operating System: Linux
Architecture: x86
Type of failure: Incorrect result at runtime
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description

The following test program occasionally prints a sendWakeup error. The failure is intermittent: it shows up on only a small fraction of runs.

I'm running GHC 7.2.1 on a fairly old Linux 2.6.27 system.

Running it from the shell in a loop should eventually produce the error message. I found that generating CPU activity (such as running "yes" in another terminal) while the shell loop below is running helps trigger the error.

$ ghc --make -Wall -O -threaded -rtsopts ghc_sendWakeup_bug.hs
$ while [ 1 ]; do ./ghc_sendWakeup_bug 40; done
ghc_sendWakeup_bug: sendWakeup: invalid argument (Bad file descriptor)
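For anyone who would rather have the loop stop by itself on the first failure, the same idea can be sketched in Haskell. This is only an illustration, not part of the original report: the helper names hasWakeupError and runUntilError are made up here, the cap of 10000 runs is arbitrary, and it assumes the process package that ships with GHC.

```haskell
import Data.List (isInfixOf)
import System.Process (readProcessWithExitCode)

-- | Hypothetical helper: does this stderr text contain the reported error?
hasWakeupError :: String -> Bool
hasWakeupError = isInfixOf "sendWakeup"

-- | Hypothetical helper: run the test binary until the error shows up on
-- stderr or the attempts run out. Returns the 1-based number of the run
-- that failed, if any.
runUntilError :: FilePath -> [String] -> Int -> IO (Maybe Int)
runUntilError exe args maxRuns = go 1
  where
    go n
      | n > maxRuns = return Nothing
      | otherwise = do
          (_code, _out, err) <- readProcessWithExitCode exe args ""
          if hasWakeupError err
            then return (Just n)
            else go (n + 1)

main :: IO ()
main = do
  -- Same binary and argument as the shell loop above; 10000 is an arbitrary cap.
  result <- runUntilError "./ghc_sendWakeup_bug" ["40"] 10000
  putStrLn $ maybe "no sendWakeup error observed"
                   (\n -> "sendWakeup error on run " ++ show n)
                   result
```

Background CPU load (for example "yes" in another terminal) would still have to be started separately.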

ghc_sendWakeup_bug.hs

module Main
    ( startTest
    , main
    ) where

import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
import Control.Concurrent.MVar
import Control.Exception (finally, catch, SomeException, mask_)
import Control.Monad (when, replicateM_, forever)
import Prelude hiding (catch)
import System.Environment (getArgs, getProgName)
import System.Exit (exitFailure)
import System.IO (hPutStrLn, stderr)

startClient :: IO ()
startClient = threadDelay (1000 * 10)

startTest :: Int -> IO ()
startTest numClients = do
    -- Code adapted from:
    -- http://hackage.haskell.org/packages/archive/base/4.4.0.0/doc/html/Control-Concurrent.html#g:12
    children <- newMVar [] :: IO (MVar [MVar ()])

    let forkChild :: IO () -> IO ThreadId
        forkChild io = do
            mvar <- newEmptyMVar
            mask_ $ do
                modifyMVar_ children (return . (mvar:))
                forkIO (io `finally` putMVar mvar ())
        waitForChildren :: IO ()
        waitForChildren = do
            cs <- takeMVar children
            case cs of
                [] -> return ()
                m:ms -> do
                    putMVar children ms
                    takeMVar m
                    waitForChildren

    serverThread <- forkIO $ forever (threadDelay 1000000)

    replicateM_ numClients (forkChild startClient)
    catch waitForChildren (printException "waitForChildren")
    catch (killThread serverThread) (printException "killThread")

printException :: String -> SomeException -> IO ()
printException place ex =
    hPutStrLn stderr $ "Error in " ++ place ++ ": " ++ show ex

main :: IO ()
main = do
    args <- getArgs
    when (length args /= 1) $ do
        prog <- getProgName
        hPutStrLn stderr $ "Usage: " ++ prog ++ " <numClients>"
        exitFailure
    let numClients = read (args !! 0)
    startTest numClients

Change History (9)

comment:1 Changed 4 years ago by tibbe

  • Cc johan.tibell@… added

comment:2 Changed 4 years ago by tibbe

  • Owner set to tibbe

I've assigned the ticket to myself, but I'm pretty swamped right now, so if someone else has time, feel free to take a look.

sendWakeup is defined in GHC/Event/Control.hs and is used to wake up the I/O manager every time a new file descriptor or timeout (i.e. threadDelay) is added. Here's the relevant code:

sendWakeup :: Control -> IO ()
#if defined(HAVE_EVENTFD)
sendWakeup c = alloca $ \p -> do
  poke p (1 :: Word64)
  throwErrnoIfMinus1_ "sendWakeup" $
    c_write (fromIntegral (controlEventFd c)) (castPtr p) 8
#else
sendWakeup c = do
  n <- sendMessage (wakeupWriteFd c) CMsgWakeup
  case n of
    _ | n /= -1   -> return ()
      | otherwise -> do
                   errno <- getErrno
                   when (errno /= eAGAIN && errno /= eWOULDBLOCK) $
                     throwErrno "sendWakeup"
#endif

Since you're on Linux, the first #if case (the eventfd one) applies.
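As a reading aid for the error text in the report, here is a small sketch of how base turns an errno into that kind of message. It is an illustration only, not code from GHC/Event/Control.hs: the name wakeupError is invented, and it assumes the failing write returned EBADF, which is what "Bad file descriptor" suggests. It uses only errnoToIOError and eBADF from Foreign.C.Error.

```haskell
import Foreign.C.Error (eBADF, errnoToIOError)

-- | Hypothetical value: the IOError that an EBADF failure would produce
-- when reported with the location string "sendWakeup", as in the
-- throwErrnoIfMinus1_ call above.
wakeupError :: IOError
wakeupError = errnoToIOError "sendWakeup" eBADF Nothing Nothing

main :: IO ()
main =
  -- The exact wording of the parenthesised part comes from the C library's
  -- strerror, so it may differ between systems.
  print wakeupError
```

On a glibc-based Linux system this should print a message of the same shape as the one in the report, which would be consistent with the c_write call failing because the eventfd descriptor was no longer valid.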

comment:3 Changed 4 years ago by Feuerbach

Couldn't reproduce here, even with all cores loaded at 100% and numClients set to 10000. GHC 7.2.1, Linux 2.6.32.

comment:4 Changed 4 years ago by Feuerbach

  • Cc roma@… added

comment:5 Changed 4 years ago by igloo

  • Milestone set to 7.4.1
  • Priority changed from normal to high

comment:6 Changed 4 years ago by michalt

I can't reproduce it either. Tried GHC 7.2.2 and HEAD with gcc 4.6.2, Linux 3.1.1.

comment:7 Changed 4 years ago by igloo

  • Milestone changed from 7.4.1 to 7.4.2

comment:8 follow-up: Changed 3 years ago by bit

I am the original reporter of this bug.

I would just like to report that GHC 7.4.1 seems to have resolved this bug; I am no longer getting the error from the test program.

comment:9 in reply to: ↑ 8 Changed 3 years ago by simonmar

  • difficulty set to Unknown
  • Resolution set to worksforme
  • Status changed from new to closed

Replying to bit:

Thanks!
