High CPU when asynchronous exception and unblocking retry on TVar raced
Detail: https://github.com/nshimaza/race-tmvar-async-exception
Runtime falls into high CPU under racing condition between async exception and unblocking retry on TVar.
- Reproduces with +RTS -Nx where x > 1
- Does NOT reproduce with +RTS -N1
- Program stalls at
killThread
- High CPU based on given -Nx
- CPU won't be 100% if you gave x smaller than available hardware threads of your platform.
- Does NOT reproduce if TVar/retry is replaced by MVar
- Reproduced with GHC 8.4.2 (macOS High Sierra (10.13.4))
- Reproduced with GHC 8.4.2 (Docker for Mac Version 18.03.1-ce-mac65)
- Reproduced with ghc-8.5.20180506 (Docker for Mac Version 18.03.1-ce-mac65)
Minimal reproducing code here. (You can find more verbose code on the above github repo.)
main :: IO ()
main = do
let volume = 1000
forM_ [1..1000] $ \i -> do
putStrFlush $ show i ++ " "
-- Spawn massive number of threads.
threads <- replicateM volume $ do
trigger <- newTVarIO False
tid <- forkIO $ void $ atomically $ do
t <- readTVar trigger
if t then pure t else retry
pure (trigger, tid)
-- Make sure all threads are spawned.
threadDelay 30000
-- Let threads start to exit normally.
forkIO $ forM_ threads $ \(trigger, _) -> threadDelay 1 *> atomically (writeTVar trigger True)
-- Concurrently kill threads in order to create race.
-- TMVar operation and asynchronous exception can hit same thread simultaneously.
-- Adjust threadDelay if you don't reproduce very well.
threadDelay 1000
forM_ threads $ \(_, tid) -> do
putCharFlush 'A'
killThread tid -- When the issue reproduced, this killThread doesn't return.
putCharFlush '\b'
This program intentionally creates race condition between asynchronous exception
and unblocking operation of retry
on TVar. From one side, a writeTVar trigger True
is attempted from external thread while target thread is blocking
at retry
on the same TVar
. On the other side, an asynchronous exception
ThreadKilled
is thrown by yet another external thread to the same target
thread.
In other word, it attempts to kill a thread about to unblock.
I guess when the above two operation hit the same thread at the same time in parallel in SMP environment, GHC runtime falls into high CPU.
Trac metadata
Trac field | Value |
---|---|
Version | 8.4.2 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | highest |
Resolution | Unresolved |
Component | Runtime System |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |