Opened 3 years ago

Closed 2 years ago

Last modified 23 months ago

#10414 closed bug (fixed)

Buggy behavior with threaded runtime (-N1 working, -N2 getting into <<loop>>)

Reported by: exio4
Owned by:
Priority: normal
Milestone: 8.0.1
Component: Compiler
Version: 7.10.1
Keywords:
Cc: simonmar
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s): Phab:D1040
Wiki Page:

Description (last modified by rwbarton)

Compiling the test case with:

ghc -O2 -threaded -eventlog -rtsopts ghc-bug.hs

Now, trying some inputs with -N2:

$ ./ghc-bug 7 +RTS -N2
ghc-bug: <<loop>>
$ ./ghc-bug 6 +RTS -N2
ghc-bug: <<loop>>
$ ./ghc-bug 5 +RTS -N2
3125
$ ./ghc-bug 5 +RTS -N2
ghc-bug: <<loop>>

With the number of capabilities reduced to 1, it works for those inputs:

$ ./ghc-bug 7 +RTS -N1

As a side note, the problem occurs only intermittently with small inputs (on my hardware) and seems to go away with bigger inputs. (The original test case felt a bit more deterministic, but I think the test case in this ticket is good enough.)

I have only tested this with GHC 7.8.4 (on Debian), but people on IRC reported the same behavior with GHC 7.10.1 on OS X and Debian.

Similar bug: #10218 (-fno-cse and -flate-dmd-anal didn't help with this).

import           Control.Applicative
import           Control.Monad

import           Control.Parallel.Strategies

import           System.Environment
    
newtype ParList a = ParList { unParList :: [a] }

nil :: ParList a
nil = ParList []
cons :: a -> ParList a -> ParList a
cons x (ParList xs) = ParList (x:xs)

instance Functor ParList where
    fmap = liftM

instance Applicative ParList where
    pure = return
    (<*>) = ap

instance Monad ParList where
    return = ParList . return
    {- v code that doesn't work -}
    (ParList xs) >>= f = ParList (withStrategy (parListChunk 8 rseq) (xs >>= unParList . f))
    --(ParList xs) >>= f = ParList (concat (parMap rseq (unParList . f) xs))
    {- ^ code that works -}
    
type Pair = (Int, [Int])

loop' :: Pair -> ParList Pair 
loop' (size,qns) = go 1
    where go n | n > size  = nil
               | otherwise = cons (size, n:qns) (go (n+1))
          
worker :: Int -> Pair -> [Pair]
worker n = unParList . go n
    where go 1 = loop'
          go n = loop' >=> go (n-1)
          
main :: IO ()
main = do
    [n] <- (read <$>) <$> getArgs
    print $ length (worker n (n,[]))
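
(For intuition, ignoring the strategy: the bind here is ordinary list bind under the newtype. An illustrative GHCi session follows, with invented example values; the parListChunk 8 rseq strategy changes only how the result list is evaluated, in sparked chunks of 8, not its value.)

$ ghci ghc-bug.hs
*Main> unParList (cons 1 (cons 2 nil) >>= \x -> ParList [x, x*10])
[1,10,2,20]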

Attachments (3)

par.hs (3.1 KB) - added by rwbarton 2 years ago.
single-module reproducer
par2.hs (1.5 KB) - added by michaelt 2 years ago.
reduced version of rwbarton's par.hs
par2.log (44.1 KB) - added by rwbarton 2 years ago.
set -o pipefail; while ./par2 +RTS -N2 -Ds 2>&1 >/dev/null | ts -s "%.s" > par2.log; do :; done


Change History (35)

comment:1 Changed 3 years ago by rwbarton

Description: modified (diff)

comment:2 Changed 3 years ago by rwbarton

I get the same behavior with HEAD.

comment:3 Changed 3 years ago by AlexET

Compiling with -debug, I sometimes get:

BrokenCode: internal error: ASSERTION FAILED: file rts/Schedule.c, line 570

comment:4 Changed 3 years ago by michaelt

I simplified the problem a little. The first command-line argument gives the chunk size passed to parListChunk, which seems to matter. I get the same result with ghc-7.8.3 and ghc-7.10.1 on OS X.

import           Control.Parallel.Strategies
import           System.Environment

type Pair = (Int, [Int])

loop' :: Pair ->  [Pair] 
loop' (size,qns) = go 1
    where go n | n > size  = []
               | otherwise =  (size, n:qns) : (go (n+1))

worker :: Int -> Int -> Pair -> [Pair]
worker chunksize =  go 
    where go 1  = loop' 
          go n  = withStrategy (parListChunk chunksize rseq) 
                 . concatMap (go (n-1)) 
                 . loop' 

main :: IO ()
main = do
    chunksize:n:_ <- fmap (map read) getArgs
    print $ length (worker chunksize n (n,[]))

With this I get, e.g.:

$ ./threads 3 7 +RTS -N2
threads: <<loop>>

$ ./threads 2 7 +RTS -N2   # reduce chunk size to 2
823543

$ ./threads 2 7 +RTS -N1   # use -N1 instead
823543

$ ./threads 3 7 +RTS -N1
823543

Last edited 3 years ago by michaelt

comment:5 Changed 3 years ago by michaelt

This came into my head again. It can be simplified, and I think clarified, like so:

import           Control.Parallel.Strategies
import           System.Environment

parConcatMapN :: Int -> Int -> (a -> [a]) -> a -> [a]
parConcatMapN chunksize depth step =  go depth 
    where go 0  = (:[])
          go n  = withStrategy (parListChunk chunksize rseq) 
                 . concatMap (go (n-1)) 
                 . step

main :: IO ()
main = do
    depth:_ <- fmap (map read) getArgs
    print $ length (test depth)

test depth = parConcatMapN 3 depth show 'x'  
-- i.e.  iterate (concatMap show) "x" !! depth

We keep re-chunking, but do not respect the chunking we already did, as the evaluation progresses through:

'x'
'\'''x''\''
'\'''\\''\'''\'''\'''x''\'''\'''\\''\'''\''
...

Last edited 3 years ago by michaelt

comment:6 Changed 3 years ago by yongqli

We've run into what seems to be the same issue. We've isolated a test case here: https://gist.github.com/j-e-k/bf8c5f027178ee20ac94

Could we get this fixed in 7.10.2?

comment:7 Changed 2 years ago by simonpj

I'd love it to be fixed in 7.10.2, but first someone needs to volunteer to look into what is wrong, and hopefully find out how to fix it. Is it a bug in a library? In the RTS? In the compiler?

Help needed!

Simon

Changed 2 years ago by rwbarton

Attachment: par.hs added

single-module reproducer

comment:8 Changed 2 years ago by rwbarton

I inlined the relevant parts of the parallel package into michaelt's example, for the sake of testing across multiple versions of GHC. In the process I discovered that building with -feager-blackholing is necessary to reproduce the bug. Well, not surprising (the parallel package specifies this flag in its cabal file). To be specific, I am building it as

ghc -threaded -O -rtsopts par -fforce-recomp -feager-blackholing

and running as

while ./par 8 +RTS -N4; do :; done

until it fails (which is usually immediately).

I managed to bisect the failure down to these three commits between 7.6 and 7.8 which added cardinality analysis: https://github.com/ghc/ghc/compare/da4ff650ae77930a5a10d4886c8bc7d37f081db7...62653122f3cf2d48a475cadecc9b4483488c9769

Interestingly, in the versions that have cardinality analysis, the program still loops even when built with -fkill-absence.

Hopefully this provides some clues to someone...

comment:9 Changed 2 years ago by simonpj

That's extremely helpful, thank you Reid.

I'm guessing that the culprit, somehow, is this code in CoreToStg:

    upd_flag | isSingleUsed (idDemandInfo bndr)  = SingleEntry
             | otherwise                         = Updatable

This arranges not to black-hole a thunk that is used at most once. Can you try with -fkill-one-shot? That switches off this optimisation, and I bet it'll make the program work.

Assuming that does fix it, the next question is: is it just the CoreToStg upd_flag? (-fkill-one-shot affects more things than just that one spot.) The next thing I'd do would be to just comment out the SingleEntry case above, rebuild the compiler, and check that the bug is gone. I'm 95% sure that it will be, but it's worth checking.

Next. If some thunk is being marked SingleEntry, and that is causing the bug:

  • Which thunk is it?
  • Is it really single-entry? (I.e. is the analysis right or not)

If you compile with -ddump-stg and look for thunks marked \s, those are the single-entry ones. There probably aren't very many. If there are very few, we can look at the "is the analysis right?" question. If there are many, we'll need to find a way to narrow it down somehow.

Simon
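
(For reference, a concrete way to carry out the -ddump-stg suggestion, reusing the build flags from comment:8; the dump file name is arbitrary. Single-entry thunks appear in the dump as closures whose lambda-form begins with \s rather than \u (updatable) or \r (function), as in the dumps quoted in comment:17 below.)

ghc -threaded -O -fforce-recomp -feager-blackholing -ddump-stg par.hs > par.stg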

comment:10 Changed 2 years ago by simonpj

BTW, is the overloading of Traversable important? Can it be eliminated from the single-module reproducer?

Last edited 2 years ago by simonpj

comment:11 Changed 2 years ago by simonmar

Lazy blackholing will still take place for thunks that are not blackholed by eager blackholing, because we have no way to distinguish between an eager-blackholed and a lazy-blackholed thunk in the runtime. We had bugs in this area in the past; see #5226. I'm not sure this helps, but it's possible that the cardinality analysis is correct and this is a runtime bug.

comment:12 Changed 2 years ago by rwbarton

Unfortunately -fkill-one-shot made no difference.

My gut feeling is that this is a bug in the RTS that was uncovered (for this program) by the cardinality analysis patch, but I have no real evidence for this.

comment:13 Changed 2 years ago by michaelt

I reduced this a little further so that it just uses the Monad instance and the two basic combinators:

rparWith s a = Eval $ \s0 -> spark# r s0
  where r = case s a of  Eval f -> case f realWorld# of  (# _, a' #) -> a'

runEval :: Eval a -> a
runEval (Eval x) = case x realWorld# of (# _, a #) -> a

and a non-recursive but monotonously layered concrete function.

comment:14 Changed 2 years ago by michaelt

The mechanism for attaching a source file must be right before my eyes, but here is the reduced module:

{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Control.Applicative
import Control.Monad
import GHC.Exts

newtype Eval a = Eval (State# RealWorld -> (# State# RealWorld, a #))

instance Functor Eval where fmap = liftM

instance Applicative Eval where  (<*>) = ap; pure  = return

instance Monad Eval where
  return x = Eval $ \s -> (# s, x #)
  Eval x >>= k = Eval $ \s -> case x s of
                                (# s', a #) -> case k a of
                                                      Eval f -> f s'

rparWith s a = Eval $ \s0 -> spark# r s0
  where r = case s a of  Eval f -> case f realWorld# of  (# _, a' #) -> a'


runEval :: Eval a -> a
runEval (Eval x) = case x realWorld# of (# _, a #) -> a


main :: IO ()
main = do -- print $ length (pf 'x') -- either statement works at least on and off
          print (program 'y')   -- but I seem to lose the effect if I use both statements

program = 
  pchunk . concatMap (pchunk . concatMap (pchunk . concatMap (pchunk . show) . show) . show) . show
  where
  -- the effect seems to vanish if I eta expand pchunk
  pchunk  = runEval 
         . fmap concat 
         .  mapM (rparWith (mapM (\x -> Eval $ \s -> seq# x s) )) 
         . chunk' 

  -- the effect seems to disappear if I reject splitAt in favor
  -- of a pattern match chunk' (a:b:c:xs) = [a,b,c]: chunk' xs
  chunk' ::  [a] -> [[a]]
  chunk' [] = []
  chunk' xs =  as : chunk' bs where (as,bs) = splitAt 3 xs


comment:15 Changed 2 years ago by michaelt

And a bit more compressed, for what it may be worth:

{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
newtype Eval a = Eval {runEval :: State# RealWorld -> (# State# RealWorld, a #)}

-- inline sequence ::  [Eval a] -> Eval [a]
well_sequenced ::  [Eval a] -> Eval [a]
well_sequenced = foldr op (Eval $ \s -> (# s, [] #))  where
  op e es = Eval $ \s -> case runEval e s of
                    (# s', a #) -> case runEval es s' of
                      (# s'', as #) -> (# s'', a : as #)

-- seemingly demonic use of spark#
ill_sequenced ::  [Eval a] -> Eval [a]
ill_sequenced  as = Eval $ spark# (case well_sequenced as of 
             Eval f -> case f realWorld# of  (# _, a' #) -> a')

main :: IO ()
main = print ((layer . layer . layer . layer . layer) show 'y')   
  where
  layer :: (Char -> String) -> (Char -> String)
  layer f = (\(Eval x) -> case x realWorld# of (# _, as #) -> concat as) 
        . well_sequenced
        . map ill_sequenced -- all is well with well_sequenced
        . map (map (\x -> Eval $ \s -> (# s, x #))) 
        . chunk'  
        . concatMap f 
        . show

This seems pretty reliably bad. I have been using:

ghc -threaded -O -rtsopts par.hs -fforce-recomp -feager-blackholing
./par +RTS -N2

Last edited 2 years ago by michaelt

Changed 2 years ago by michaelt

Attachment: par2.hs added

reduced version of rwbarton's par.hs

comment:16 Changed 2 years ago by michaelt

Sorry, I'm spamming the trac a bit. Notice that in the ultra-simplified module, now attached properly, the Lift wrapping that parallel uses for rparWith is nowhere to be found. If I wrap things in my ill_sequenced with Lift, I can't reproduce the effect. If, though, that use of Lift in the definition of rparWith is required by whatever is going on with spark# and some of these other opaque-to-me primitives, then there is a question of whether it is used enough: the original program is doing an end-run around it. It is presumably obviously undesirable, but if in rwbarton's par.hs I complicate the definition of rpar, which is

rpar :: a -> Eval a
rpar  x = Eval $ \s -> spark# x s

and use instead something like

rpar :: a -> Eval a
rpar x = Eval $ \s -> case y of
   Eval f -> case f s of 
     (# s1 , l #) -> case l of Lift w -> (# s1 , w #)
  where y = Eval $ \s -> spark# (Lift x) s

then it seems all is well again. That probably destroys all the desired effects; but if it doesn't, then the problem may just be that the library is letting the user get too close to spark#, which is practically naked in rpar.

comment:17 Changed 2 years ago by rwbarton

I took a look at the generated STG for par2.hs before and after the cardinality analysis commit. Before, there are no \s thunks at all. After, there are two. One is in

Main.main_go [Occ=LoopBreaker]
  :: [[GHC.Types.Char]] -> [GHC.Types.Char]
[GblId,
 Arity=1,
 Caf=NoCafRefs,
 Str=DmdType <S,U>,
 Unf=OtherCon []] =
    \r srt:SRT:[] [ds_s1nu]
        case ds_s1nu of _ {
          [] -> [] [];
          : y_s1ny [Occ=Once] ys_s1nz [Occ=Once] ->
              let {
                sat_s1pj [Occ=Once, Dmd=<L,1*U>] :: [GHC.Types.Char]
                [LclId, Str=DmdType] =
                    \s srt:SRT:[] [] Main.main_go ys_s1nz;
              } in  GHC.Base.++ y_s1ny sat_s1pj;
        };

This comes from the Core

Main.main_go [Occ=LoopBreaker]
  :: [[GHC.Types.Char]] -> [GHC.Types.Char]
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType <S,U>]
Main.main_go =
  \ (ds_XrA :: [[GHC.Types.Char]]) ->
    case ds_XrA of _ {
      [] -> GHC.Types.[] @ GHC.Types.Char;
      : y_arg ys_arh ->
        GHC.Base.++ @ GHC.Types.Char y_arg (Main.main_go ys_arh)
    }

which presumably comes from the use of concat in layer. This happens even when I build the program with -fkill-absence -fkill-one-shot. Could that be because base was built with cardinality analysis enabled? I don't entirely see how, but I can try rebuilding the libraries with -fkill-absence -fkill-one-shot.

Anyway, I guess the main question, which I'm not sure how to answer, is whether it is correct that this thunk is marked as single-entry.

The other single-entry thunk is, I think, very similar and arises from concatMap:

{- note
$wlayer_r1m6
  :: (GHC.Types.Char -> GHC.Base.String)
     -> GHC.Prim.Char# -> GHC.Base.String
[GblId, Caf=NoCafRefs, Str=DmdType, Unf=OtherCon []]
w3_r1ma :: GHC.Types.Char -> GHC.Base.String
[GblId, Arity=1, Str=DmdType, Unf=OtherCon []]
-}
Main.main_go2 [Occ=LoopBreaker]
  :: [GHC.Types.Char] -> [GHC.Types.Char]
[GblId, Arity=1, Str=DmdType <S,1*U>, Unf=OtherCon []] =
    \r srt:SRT:[(r8, Main.main_go2), (r1m6, $wlayer_r1m6),
                (r1ma, w3_r1ma)] [ds_s1oh]
        case ds_s1oh of _ {
          [] -> [] [];
          : y_s1ol [Occ=Once!] ys_s1oq [Occ=Once] ->
              case y_s1ol of _ {
                GHC.Types.C# ww1_s1oo [Occ=Once] ->
                    let {
                      sat_s1pv [Occ=Once, Dmd=<L,1*U>] :: [GHC.Types.Char]
                      [LclId, Str=DmdType] =
                          \s srt:SRT:[(r8, Main.main_go2)] [] Main.main_go2 ys_s1oq;
                    } in 
                      case $wlayer_r1m6 w3_r1ma ww1_s1oo of sat_s1pu {
                        __DEFAULT -> GHC.Base.++ sat_s1pu sat_s1pv;
                      };
              };
        };

I'm going to see what happens when I inline the definitions of concat and concatMap into this module.

comment:18 Changed 2 years ago by rwbarton

Inlining concat and concatMap made no difference: the program still loops, and a \s thunk is still generated inside "main_mygo". This happens for any combination of the -fkill-absence and -fkill-one-shot flags, and in every version I tested (the 7.7 commit that added cardinality analysis, 7.8.4, 7.10.1, and HEAD).

I then tried removing the SingleEntry case, as Simon suggested in comment:9; that did generate a \u thunk instead, and the program no longer <<loop>>s.

So one conclusion is that -fkill-absence/-fkill-one-shot don't fully disable cardinality analysis as they are expected to.

comment:19 Changed 2 years ago by simonpj

Michael, Reid, that is super-helpful. The cardinality analysis is plain wrong, so now we know just what is happening. I'm working on a fix.

comment:20 Changed 2 years ago by simonpj

Drat. On further reflection, I think the cardinality analysis is correct. So it looks as though there is a bug in the runtime system.

Simon M: might you find time to investigate? With only two single-entry thunks it can't be that hard!

I have literally no hypothesis for why a single-entry thunk could cause <<loop>>.

Simon

comment:21 Changed 2 years ago by rwbarton

I think I've finally come up with at least part of a plausible explanation. Tell me if this seems right...

Suppose I have a heap object whose initial value is

  x = \u []  concat [[1],[]]   -- using "\u []" as syntax for "updatable thunk",
                               -- otherwise normal Core syntax (not STG)

and suppose that some thread first evaluates x to WHNF. Using the definition of concat as main_go from above, this will allocate a single-entry thunk y = \s [] concat [], and then calling (++) will lead to:

  x = 1 : z
  y = \s []  concat []
  z = \u []  (++) [] y

At this point the heap has the following property

(*) The heap contains a single-entry thunk (y) and a regular thunk (z) such that entering the regular thunk will cause the single-entry thunk to be entered as well.

(The key point is that the single-entry thunk has already been allocated on the heap, in contrast to a situation in which entering a regular thunk would cause a new single-entry thunk to be allocated, possibly entered, and then become garbage, all before evaluation of that regular thunk is complete.)

Now, consider the following execution path of two threads A and B.

  • Both threads enter z simultaneously (before either manages to overwrite it with a black hole, if eager blackholing was enabled when we compiled (++); otherwise before either manages to update it to an indirection).
  • Thread A does the case analysis inside (++) and enters y, and overwrites it with a black hole before thread B does anything else.
  • Now thread B does the case analysis inside (++) and enters y, but y has been overwritten with a black hole so thread B blocks. But y is never going to be updated, so thread B will block forever. This is bad!
  • Finally thread A finishes evaluating y (to []) and then updates z accordingly. But thread B is still blocking on the black hole and even if it could get unblocked by some mechanism (say in the GC) there's no longer enough information on the heap to recover the correct value of y and allow thread B to continue.

This doesn't exactly explain why the programs in this ticket <<loop>>, but a thread becoming permanently blocked is equally bad behavior, I think.

Some extra evidence that something like this is going on is that if I inline the definition of (++) into par2.hs as well, so that it is compiled with eager blackholing enabled, the program <<loop>>s much less frequently, just a few percent of the time as opposed to over half the time. That would match up with a smaller window of simultaneity in the first step of the execution trace above.

If this analysis is correct, then assuming we want to continue to allow threads to enter thunks in an unsynchronized way (to avoid prohibitive locking costs), it seems we have to ensure that the condition (*) never holds, at least when eager blackholing is enabled. Generating single-entry thunks is still okay as long as they never survive as live heap objects after the thunk that allocated them has been reduced to WHNF.
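
(A source-level caricature of the hazard, with invented names; single-entry-ness cannot be written directly in source, the demand analyser infers it, so the Haskell sketch below is only indicative. Sequentially, whoever enters z enters y exactly once; but thunk entry is unsynchronized, so two threads can both enter z before either blackholes it, and then both enter y. With eager blackholing the second thread then blocks on y's black hole, and since a single-entry thunk pushes no update frame, no update ever arrives to wake it.)

z :: Int
z = let y = sum [1 .. 10000 :: Int]  -- y is syntactically used once, so
    in y + 1                         -- its thunk may be marked \s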

comment:22 Changed 2 years ago by rwbarton

I also came across this comment in rts/ThreadPaused.c (I think it concerns a different scenario):

    // NB. Blackholing is *compulsory*, we must either do lazy
    // blackholing, or eager blackholing consistently.  See Note
    // [upd-black-hole] in sm/Scav.c.

Does this mean that every module in the entire program, including all the libraries it links against, like base, needs to be compiled with the same presence or absence of -feager-blackholing? The User's Guide doesn't mention anything about this, and if that is the case, the flag seems essentially unusable except for those who build their own GHC from source. I'm hoping the comment is just worded unclearly.

comment:23 Changed 2 years ago by simonpj

We need Simon Marlow's advice here.

What you say does seem plausible:

  • With lazy blackholing, a single-entry thunk will never be black-holed. Lazy black-holing only black-holes thunks with an update frame on the stack, and single-entry thunks do not push an update frame. So that could explain why -feager-blackholing is required.
  • Rather to my surprise, the code generator emits code to black-hole even a single-entry thunk. I can't see any good reason why this happens. Look at StgCmmClosure.blackHoleOnEntry.
  • Re comment:22 we need Simon to say. It would be pretty bad if every module had to be compiled consistently. However, I note that the comment in Scav.c is in code that messes with update frames, so single-entry thunks probably don't matter.

Even if all this is right, we still don't know why we get <<loop>>. I'm assuming that this means we have entered a black hole that is under evaluation by this thread, but suddenly I'm not sure. (For a black hole being evaluated by another thread, we'd block.)

I suggest you try making blackHoleOnEntry return False for single-entry thunks.

Last edited 2 years ago by simonpj

comment:24 Changed 2 years ago by simonmar

Thanks for all the awesome debugging work here, I think we finally have all the pieces!

@rwbarton's analysis seems plausible to me too. A single-entry thunk can still be entered multiple times in a parallel setting, and if eager blackholing is on then it is possible that a thread can get blocked indefinitely.

Threads blocked indefinitely are detected by the GC as deadlocked, and receive an exception, which results in the <<loop>> message.

The fix should be simple: just don't do eager blackholing for single-entry thunks.

Regarding this comment:

    // NB. Blackholing is *compulsory*, we must either do lazy
    // blackholing, or eager blackholing consistently.  See Note
    // [upd-black-hole] in sm/Scav.c.

It means every update frame must refer to a black hole by the time the GC runs; it's an invariant the GC relies on. It shouldn't be a problem here, because it only applies to update frames. I can reword the comment so it's clearer.

Changed 2 years ago by rwbarton

Attachment: par2.log added

set -o pipefail; while ./par2 +RTS -N2 -Ds 2>&1 >/dev/null | ts -s "%.s" > par2.log; do :; done

comment:25 Changed 2 years ago by rwbarton

The actual cause of the <<loop>> here seems to be that two threads are each blocking on a black hole that is being evaluated, or more likely has been evaluated but not updated, by the other thread. I attached a complete -Ds log above, but the relevant lines are

...
0.011574 7ff6be7fc700: cap 1: thread 6 stopped (blocked on black hole owned by thread 5)
...
0.011808 7ff6c6700740: cap 0: thread 5 stopped (blocked on black hole owned by thread 6)
...

I didn't work out exactly how this can arise, but it probably involves two single-entry thunks and two ordinary thunks whose evaluations force both of the single-entry thunks, but in different orders.

Changing blackHoleOnEntry for single-entry thunks as suggested did fix par2.hs. I'm going to test the other examples in this ticket now.

comment:26 Changed 2 years ago by michaelt

rwbarton, I think I have tested all of these examples on my machine, for what it's worth.

I rebuilt HEAD with what I took to be the patch described above:

-     LFThunk _ _no_fvs _updatable _ _ -> True
+     LFThunk _ _no_fvs _updatable _ _ -> _updatable 

for https://github.com/ghc/ghc/blob/master/compiler/codeGen/StgCmmClosure.hs#L769.

Everything works fine, or seems to work fine.

The complex program yongqli linked, with all the fancy imports, was a little erratic; I increased the size of the CSV a lot, to make it more reliably bad somewhere along the way. I then just used the scheme of running ./bugcsv +RTS -N_ | grep loop 500 times each with -N5 and -N3. (-N2 doesn't seem to bring out the pathology with this program.) With ghc-7.10.1 I got

bugcsv: <<loop>>

about 200 times either way, but with the patched HEAD, blessed silence.
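
(A concrete form of that scheme, with the exact invocation guessed since the comment doesn't spell it out; <<loop>> is printed on stderr, hence the redirect below.)

for i in $(seq 500); do ./bugcsv +RTS -N5 2>&1 | grep loop; done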

comment:27 Changed 2 years ago by rwbarton

Differential Rev(s): Phab:D10414
Status: new → patch

comment:28 Changed 2 years ago by bgamari

Differential Rev(s): Phab:D10414 → Phab:D1040

comment:29 Changed 2 years ago by Ben Gamari <ben@…>

In aaa0cd20fdaf8e923e3a083befc2612154cba629/ghc:

Don't eagerly blackhole single-entry thunks (#10414)

In a parallel program they can actually be entered more than once,
leading to deadlock.

Reviewers: austin, simonmar

Subscribers: michaelt, thomie, bgamari

Differential Revision: https://phabricator.haskell.org/D1040

GHC Trac Issues: #10414

comment:30 Changed 2 years ago by Ben Gamari <ben@…>

In d27e7fdb1f16ebb28fee007fc0b1dfbd761789d7/ghc:

Add more discussion of black-holing logic for #10414

Signed-off-by: Ben Gamari <ben@smart-cactus.org>

comment:31 Changed 2 years ago by bgamari

Resolution: fixed
Status: patch → closed

Merged.

comment:32 Changed 23 months ago by thomie

Milestone: 8.0.1