Opened 3 months ago

Closed 3 months ago

#14171 closed bug (fixed)

STM causes program to suddenly exit

Reported by: MichaelBurge Owned by: bgamari
Priority: highest Milestone: 8.4.1
Component: libraries/stm Version: 8.2.1
Keywords: Cc:
Operating System: Linux Architecture: Unknown/Multiple
Type of failure: Runtime crash Test Case:
Blocked By: Blocking:
Related Tickets: #8091 Differential Rev(s): Phab:D3919
Wiki Page:


Observed behavior:

  • The below program exits with return code 0 between the call to 'error "derp2"' and the call to 'error "derp"'
  • If the statement 'error "derp2"' is uncommented, the program will terminate with an exception.
  • The program only exits without output with -O. And in particular, with no-ignore-interface-pragmas.

Expected behavior:

  • The program should terminate with an exception regardless of whether 'error "derp2"' is commented out or not.
module Main where

import Control.Concurrent.STM
import Control.Concurrent.STM.TVar

data A = A String deriving (Eq, Show)

data E = E {
  a :: TVar [Int],
  b :: TVar A,
  c :: TVar [Int]

consistency_1 :: E -> STM Bool
consistency_1 = \e -> do
  _ <- readTVar $ c e
  return True

installSanityChecks :: E -> IO ()
installSanityChecks e = do
  x e
  error "derp"
x e = do
  atomically $ mapM_ installCheck [
  -- error "derp2"
    installCheck check = always $ check e

main :: IO ()
main = do
  state <- initialize
  installSanityChecks state

initialize :: IO E
initialize = E <$> newTVarIO [] <*> newTVarIO (A "USD") <*> newTVarIO []

Build options: (Remove -O and it will show the error)

/home/mburge/tmp/ghc/ghc-8.2.1/build/bin/ghc \
    -O \
    -package-id base- \
    -package-id stm- \
    app/Main.hs \
    -package-db /home/mburge/.stack/snapshots/x86_64-linux/nightly-2017-08-24/8.2.1/pkgdb

I used a fresh copy of GHC installed from here:

Change History (8)

comment:1 Changed 3 months ago by bgamari

Owner: set to bgamari
Priority: normalhighest

This sounds quite bad. Furthermore, it seems that check# has absolutely no test coverage.

I suspect this is due to our treatment of STM in the demand analyser; x in the above program ends up looking like this after strictness analysis,

x_s2KO :: E -> State# RealWorld -> (# State# RealWorld, () #)
  = \ (e_a2hN [Dmd=<xB,U(A,A,U(U))>] :: E)
      (eta_B1 [Dmd=<B,U>] :: State# RealWorld) ->
        @ ()
        (\ (eta_a2K4 [Dmd=<B,U>, OS=OneShot] :: State# RealWorld) ->
           case catchRetry#
                  @ ()
                  (\ (s_a2K5 [Dmd=<B,U>, OS=OneShot] :: State# RealWorld) ->
                     case e_a2hN of
                     { E ds_d2EV [Dmd=<B,A>] ds_d2EW [Dmd=<B,A>]
                         ds_d2EX [Dmd=<B,1*U(U)>] ->
                     case ds_d2EX of { GHC.Conc.Sync.TVar tvar#_a2JR [Dmd=<B,U>] ->
                     case readTVar# @ RealWorld @ [Int] tvar#_a2JR s_a2K5
                     { (# ipv_a2JG [Dmd=<B,U>], ipv1_a2JH [Dmd=<B,A>] #) -> retry# @ () ipv_a2JG }
           { (# ipv_a2Kj [Dmd=<S,U>], ipv1_a2Kk [Dmd=<L,A>] #) ->
           case check#
                  @ ()
                  (\ (s_a2Ki [Dmd=<S,U>] :: State# RealWorld) ->
                     case e_a2hN of
                     { E ds_d2EV [Dmd=<L,A>] ds_d2EW [Dmd=<L,A>]
                         ds_d2EX [Dmd=<S(S),1*U(U)>] ->
                     case ds_d2EX of { GHC.Conc.Sync.TVar tvar#_a2JR [Dmd=<S,U>] ->
                     case readTVar# @ RealWorld @ [Int] tvar#_a2JR s_a2Ki
                     { (# ipv_a2JG [Dmd=<S,U>], ipv1_a2JH [Dmd=<L,A>] #) ->
                     (# ipv_a2JG, GHC.Tuple.() #)
           of s'_a2Kv [Dmd=<S,U>]
           { __DEFAULT -> (# s'_a2Kv, GHC.Tuple.() #) }

Which I believe then leads us to conclude that the failure in installSanityChecks is redundant. Concerningly, this doesn't seem to be caught by the -fcatch-bottoms flag introduced in #13916.

I'll try to come back to this tomorrow or after ICFP.

Last edited 3 months ago by bgamari (previous) (diff)

comment:2 Changed 3 months ago by Ben Gamari <ben@…>

In dd643bcc/ghc:

Improve stm haddocks

While looking at #14171 I noticed these readability issues. Fix them.

comment:3 Changed 3 months ago by bgamari

I had another quick look at this and believe that the critical mistake is this,

  dmd = <C(C(S(SL))),C1(C1(U(U,A)))>
  expr = catchRetry#
           @ ()
           (\ (s_a2KK [Dmd=<B,U>, OS=OneShot] :: State# RealWorld) ->
              case readTVar# @ RealWorld @ [Int] ww_s485 s_a2KK of
              { (# ipv_a2Kl [Dmd=<B,U>], ipv1_a2Km [Dmd=<B,A>] #) ->
              retry# @ () ipv_a2Kl
  fun dmd_ty = <xC(S),1*C1(U)><L,1*C1(U)><L,U>
  arg dmd = <xC(S),1*C1(U)>
  arg dmd_ty = ([s485 :-> <xB,U>], b)
  res dmd_ty = <L,1*C1(U)><L,U>
  overall res dmd_ty = <L,1*C1(U)><L,U>b {s485-><xB,U>}

Note that arg dmd_ty bottoms, due to its use of retry#. However, arg dmd is ExnStr and therefore should catch this bottom. However, overall res dmd_ty nevertheless bottoms. I'm not yet sure why, but it's quite intriguing. This all seems quite reminiscentv of #13916, but it doesn't seem to be the same issue.

comment:4 Changed 3 months ago by bgamari

I think I see what is going on here: postProcessDmdResult looks only for ThrowsExn DmdResult when computing termination,

postProcessDmdResult :: Str () -> DmdResult -> DmdResult
postProcessDmdResult Lazy           _         = topRes
postProcessDmdResult (Str ExnStr _) ThrowsExn = topRes  -- Key point!
postProcessDmdResult _              res       = res

However, the demand being analysed here is Diverges, not ThrowsExn (again, due to the use of retry#). One "solution" here is,

  • compiler/basicTypes/Demand.hs

    diff --git a/compiler/basicTypes/Demand.hs b/compiler/basicTypes/Demand.hs
    index dfff0a2c92..f56d28c4a9 100644
    a b postProcessDmdType du@(JD { sd = ss }) (DmdType fv _ res_ty) 
    14401441postProcessDmdResult :: Str () -> DmdResult -> DmdResult
    14411442postProcessDmdResult Lazy           _         = topRes
    14421443postProcessDmdResult (Str ExnStr _) ThrowsExn = topRes  -- Key point!
     1444postProcessDmdResult (Str ExnStr _) Diverges  = topRes
    14431445postProcessDmdResult _              res       = res

This allows the program malfunctioning in this ticket to run as expected. However, it's not clear to me that this is correct: the ThrowsExn/Diverges distinction isn't defined in the original demand analysis paper nor any of the later papers that I have found. Moreover, postProcessDmdResult was written in 9915b6564403a6d17651e9969e9ea5d7d7e78e7f, the same commit that introduced ThrowsExn, so it seems likely that the difference here was intentional.

Last edited 3 months ago by bgamari (previous) (diff)

comment:5 Changed 3 months ago by bgamari

Differential Rev(s): Phab:D3919
Milestone: 8.4.1
Status: newpatch

I think the correct fix is Phab:D3919.

comment:6 Changed 3 months ago by bgamari

comment:7 Changed 3 months ago by Ben Gamari <ben@…>

In 10a1a47/ghc:

Model divergence of retry# as ThrowsExn, not Diverges

The demand signature of the retry# primop previously had a Diverges
result.  However, this caused the demand analyser to conclude that a
program of the shape,

    catchRetry# (... >> retry#)

would diverge. Of course, this is plainly wrong; catchRetry#'s sole
reason to exist is to "catch" the "exception" thrown by retry#. While
catchRetry#'s demand signature correctly had the ExnStr flag set on its
first argument, indicating that it should catch divergence, the logic
associated with this flag doesn't apply to Diverges results. This
resulted in #14171.

The solution here is to treat the divergence of retry# as an exception.
Namely, give it a result type of ThrowsExn rather than Diverges.

Updates stm submodule for tests.

Test Plan: Validate with T14171

Reviewers: simonpj, austin

Subscribers: rwbarton, thomie

GHC Trac Issues: #14171, #8091

Differential Revision:

comment:8 Changed 3 months ago by bgamari

Resolution: fixed
Status: patchclosed
Note: See TracTickets for help on using tickets.