Opened 7 years ago

Closed 7 years ago

Last modified 4 years ago

#986 closed merge (fixed)

SMP race condition in getContents

Reported by: int-e Owned by: igloo
Priority: normal Milestone: 6.6.1
Component: Runtime System Version: 6.6
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Difficulty: Moderate (less than a day)
Test Case: conc068 Blocked By:
Blocking: Related Tickets:

Description

This is a problem mentioned in the Haskell on SMP paper in section 3.5. If a thunk contains an unsafePerformIO or was created by unsafeInterleaveIO, evaluating it multiple times is harmful. The code below triggers this problem.

import Data.List
import Data.Char
import Control.Parallel
import System.IO

{-# NOINLINE fool #-}
-- 'fool' just makes sure CSE doesn't meddle with the code below
fool :: [a] -> [a]
fool (x:xs) = xs

main = do
    hSetBuffering stdin NoBuffering
    hs <- getContents
    let -- create copies of the input
        ts = map (:hs) ['a'..'z']
        -- sum the characters of each copy
        qs = map (foldl' (+) 0 . map ord . fool) ts
        -- in parallel
        rs = foldr (\x y -> x `par` y `par` (x:y)) [] qs
    -- compare the results and print 'True' if they are all equal.
    -- This should never print 'False'
    print $ all (uncurry (==)) $ zip rs (tail rs)

Don Steward kindly tried it on an SMP machine with the following results:

<dons> $ ghc -O A.hs -threaded
<dons> /tmp/ghc5400_0/ghc5400_0.hc:275:0:
<dons>      warning: implicit declaration of function 'newSpark'
<dons> -- good sign I got the right rts
<dons> $ yes abqszzzq | head -c 100000 | ./a.out +RTS -N1
<dons> True
<dons> $ yes abqszzzq | head -c 100000 | ./a.out +RTS -N2
<dons> False
<dons> $ uname -msr
<dons> Linux 2.6.15.3-general i686
<dons> with SMP kernel
<dons> CPU7: Intel(R) Xeon(TM) CPU 2.80GHz stepping 08
<dons> Total of 8 processors activated (44807.44 BogoMIPS).

To solve this problem, we need something like the justOnce suggested in the paper. It should be used in unsafePerformIO and unsafeInterleaveIO. Non-locking versions could be provided as unlockedUnsafePerformIO and unlockedInterleaveIO. This would allow existing libraries to work on SMP without or with just minor modifications, as far as I can see. The assumption that a thunk is evaluated at most once is fairly widespread.

Change History (11)

comment:1 Changed 7 years ago by int-e

  • Component changed from Compiler to Runtime System

comment:2 Changed 7 years ago by igloo

  • Milestone set to 6.6.1

comment:3 Changed 7 years ago by simonmar

  • Architecture changed from x86 to Multiple
  • Difficulty changed from Unknown to Moderate (1 day)
  • Operating System changed from Linux to Multiple
  • Owner set to simonmar

I'll take this one

comment:4 Changed 7 years ago by simonmar

  • Owner changed from simonmar to igloo
  • Type changed from bug to merge

I've committed some fixes to the HEAD for this, let's let them settle a while before merging. The patches are:

Tue Mar  6 14:31:12 GMT 2007  Simon Marlow <simonmar@microsoft.com>
  * add noDuplicate#

Tue Mar  6 14:27:32 GMT 2007  Simon Marlow <simonmar@microsoft.com>
  * THREADED_RTS: use cas() when claiming thunks

and for libraries/base:

Tue Mar  6 14:54:24 GMT 2007  Simon Marlow <simonmar@microsoft.com>
  * Prevent duplication of unsafePerformIO on a multiprocessor

and for the testsuite:

Tue Mar  6 15:54:50 GMT 2007  Simon Marlow <simonmar@microsoft.com>
  * add test for #986

comment:5 Changed 7 years ago by simonmar

One more patch to merge:

Wed Mar  7 00:56:48 PST 2007  Simon Marlow <simonmar@microsoft.com>
  * add declaration for noDuplicatezh_fast

comment:6 Changed 7 years ago by simonmar

One more to merge:

Thu Mar  8 01:57:17 PST 2007  Simon Marlow <simonmar@microsoft.com>
  * add noDuplicatezh_fast to symbol table

comment:7 Changed 7 years ago by igloo

  • Test Case set to conc068

comment:8 Changed 7 years ago by igloo

  • Resolution set to fixed
  • Status changed from new to closed

All merged.

comment:9 Changed 6 years ago by simonmar

  • Architecture changed from Multiple to Unknown/Multiple

comment:10 Changed 6 years ago by simonmar

  • Operating System changed from Multiple to Unknown/Multiple

comment:11 Changed 4 years ago by simonmar

  • Difficulty changed from Moderate (1 day) to Moderate (less than a day)
Note: See TracTickets for help on using tickets.