"evaluate" optimized away

changed weight to 5

Actually this ticket is a bit different to #2273 (closed). Here is what is happening.

Here are the background definitions:

evaluate :: a -> IO ()
-- Defined in GHC.IO.Base
evaluate x = x `seq` return ()

poss_err :: String 
-- Assume this doesn't get constant folded away
-- So poss_err isn't definitely bottom.
-- (You had a NOINLINE on throwIfNegative)
poss_err = if (-1) < 0 then error "urk" else "foo"

assertFailure :: String -> IO ()
-- Defined in Test.HUnit.Lang
assertFailure s = throw (userError "blah")

Now consider the term

...catch (evaluate poss_err >> assertFailure "foo")...

By the time we've inlined (>>) and evaluate, we get something like this

...catch (\s. case poss_err of _ -> assertFailure "foo" s)...

Now if GHC sees (case x of y { ...blah... }), and y is used strictly in ...blah..., it discards the case expression. After all, we reason that if x turned out to throw an exception, we'll get the exception later instead. And indeed, since ...blah... diverges (well, throws a different error), y is used strictly.

It's like the case in our imprecise-exception paper where we argue that we should not specify which of the two errors is thrown by (error "urk" + error "bok"). It's imprecise.

However, this is the IO monad, where the order of evaluation is defined, and so are the exceptions raised. There are two mistakes.

First, in GHC.IO.Base, the function evaluate is small and hence can be inlined. That exposes the fact that the state token is not affected by evaluating the argument. If we didn't inline evaluate we'd end up with something like

  \s.  case (evaluate poss_err s) of (# s1, _ #) -> assertFailure "blah" s1

Now we can't drop the case because we need the state token s1. If we just put a {-# NOINLINE #-} on evaluate we'll hide this from GHC so it won't discard such cases.

Second, in Test.HUnit.Lang, the function assertFailure has an IO type, but is defined using throw. It should be defined using throwIO. The whole point of throwIO is that it consumes a state token, and that's what sequences it relative to earlier producers of the state token.
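To make the throwIO point concrete, here is a minimal sketch. The names assertFailureIO, possErr and whichException are hypothetical stand-ins, not the HUnit or base definitions, and the sketch assumes an evaluate that meets its ordering guarantee. It only reports which exception escapes; the exact text shown for the exception can vary between GHC versions.

```haskell
import Control.Exception (SomeException, evaluate, throwIO, try)

-- Hypothetical stand-in for HUnit's assertFailure, written with throwIO
-- so that the exception is sequenced by the IO monad's state token.
assertFailureIO :: String -> IO ()
assertFailureIO msg = throwIO (userError msg)

-- A value whose evaluation to WHNF throws (stand-in for poss_err).
possErr :: String
possErr = error "urk"
{-# NOINLINE possErr #-}

-- Run the two actions in order and report which exception escaped.
whichException :: IO String
whichException = do
  r <- try (evaluate possErr >> assertFailureIO "foo") :: IO (Either SomeException ())
  pure (either show (const "no exception") r)

main :: IO ()
main = whichException >>= putStrLn
```

If evaluate behaves as intended, the reported exception should be the "urk" error rather than the user error "foo".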

I'll fix evaluate, but someone else had better deal with HUnit.

Simon

It seems to me that we ought to be able to inline evaluate; if we emitted something directly like:

case (# touch s, poss_err #) of (# s1, _ #) -> assertFailure "blah" s1

where touch is some primitive that eventually becomes a no-op but enforces ordering (maybe we don’t even need it as it stands?). In this case the case wouldn't be dropped. NOINLINE works, but it seems a little wrong to me to add the overhead of a function application.

No -- GHC would just split them apart. You'd need a new primitive:

evaluate# :: o -> State# a -> State# a

That is, with the same type as touch# but it does evaluation, which touch# does not. Otherwise I can't see how to be certain that the eval will never be dropped; the only way I can see is to combine evaluation and state-token-transformation in a way that GHC can't disentangle, hence evaluate#.

I don't think this is common enough that the performance impact of an out-of-line call will be important.

Simon

assigned to @simonmar

Simon and I have just agreed that the right thing is to implement a small family of primops:

touchS# :: a -> State# b -> State# b     -- The current touch#
seqS#   :: a -> State# b -> State# b     -- Used for 'evaluate'
parS#   :: a -> State# b -> State# b     -- Used for the par monad

One rather user-visible result of this ticket is that we should now discourage the use of the idiom

x `seq` return ()

I can see a few instances of this in base

ezyang@javelin:~/Dev/ghc-master/libraries/base$ grep -R '`seq` return ()' .
./Foreign/C/String.hs:        go [] n     = n `seq` return () -- make it strict in n
./Foreign/C/String.hs:        go [] n     = n `seq` return () -- make it strict in n
./GHC/IO/Handle/Text.hs:    c `seq` return ()
./GHC/Conc/Windows.hs:  r `seq` return () -- avoid space leak

and there are probably more elsewhere.

Indeed. Every one of these should be a call to evaluate; that is what evaluate is for! Ian: can you make it so?

Simon

I'm just adding Don's comment and my response from ghc-users.

Don says: that's a very common idiom. Interestingly, we have:

-- | Strict (call-by-value) application, defined in terms of 'seq'.
($!)    :: (a -> b) -> a -> b
#ifdef __GLASGOW_HASKELL__
f $! x  = let !vx = x in f vx  -- see #2273
#elif !defined(__HUGS__)
f $! x  = x `seq` f x
#endif

Simon's response: It's very different, I think.

evaluate is in the IO monad, and (should) guarantee to evaluate the argument before proceeding, so if evaluating the argument to WHNF diverges or throws an exception, any exceptions thrown (in the IO monad at least) after the 'evaluate' should not happen. For example:

evaluate (throw "first") >> throwIO (userError "second")

should be guaranteed to throw "first" and not "second". (The current bug is that 'evaluate' doesn't meet its guarantee.)

Strict application ($!) is not in the IO monad, so it merely makes a strictness guarantee. It doesn't guarantee to make exceptions in the argument "beat" exceptions in the function. For example

((:) $! (throw "second")) $! (throw "first")

is not guaranteed to throw "first" rather than "second".

Does that distinction make sense? Perhaps the contrast is a useful one. I wonder where it might be documented?
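A small sketch of the contrast, for anyone who wants to try it. It is a variation on the two examples above: it uses ErrorCall for every exception so that the snippet typechecks, and the names ordered, unordered and message are made up for illustration. It assumes an evaluate that meets its guarantee.

```haskell
import Control.Exception (ErrorCall (..), evaluate, throw, throwIO, try)

-- IO sequencing: evaluate forces its argument before the next action runs,
-- so "first" should be the exception that escapes.
ordered :: IO (Either ErrorCall ())
ordered =
  try (evaluate (throw (ErrorCall "first") :: ()) >> throwIO (ErrorCall "second"))

-- Pure strict application: both arguments are bottom, and which of the two
-- exceptions is reported is imprecise.
unordered :: IO (Either ErrorCall [Int])
unordered =
  try (evaluate (((:) $! throw (ErrorCall "second")) $! throw (ErrorCall "first")))

-- Extract the message of the exception that was caught, if any.
message :: Either ErrorCall a -> String
message (Left (ErrorCall m)) = m
message (Right _) = "no exception"

main :: IO ()
main = do
  ordered >>= putStrLn . message
  unordered >>= putStrLn . message
```

The first line printed should be "first". The second line may be either message, and may change with optimisation level or compiler version, which is the point of the contrast.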

Why is x `seq` return () discouraged? Why would I use evaluate if I want to make my code strict in x but don't care when it is evaluated? This seems to be what ezyang's examples are about (the ones in Foreign.C.String even say so explicitly) so I don't understand why these should be replaced with evaluate.

OK fair enough. If all you want is strictness, then seq should be fine. If you want exception ordering, or space-leak squashing, then evaluate is better.

I would like to understand this tradeoff better. I would like to be able to say "use evaluate whenever the IO monad is available" but it seems you might preclude some optimizations with this advice.

Right, it's quite common to use

   do ...
      return $! e

to avoid the thunk that would otherwise be created for e. My guess is that in most cases we don't care about the order of evaluation in these cases, so $! is fine.
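As an illustration of the two options, here is a minimal sketch; sumStrict and sumEvaluated are hypothetical names, not library functions. Both avoid returning a thunk. The difference is only in what is promised about when the evaluation happens relative to other IO actions.

```haskell
import Control.Exception (evaluate)

-- Strict in the result via ($!): no thunk is returned, but this is only a
-- strictness guarantee, not an ordering guarantee.
sumStrict :: [Int] -> IO Int
sumStrict xs = return $! sum xs

-- evaluate forces the result at this point in the IO sequence.
sumEvaluated :: [Int] -> IO Int
sumEvaluated xs = evaluate (sum xs)

main :: IO ()
main = do
  a <- sumStrict [1 .. 10]
  b <- sumEvaluated [1 .. 10]
  print (a, b) -- prints (55,55)
```

When the forced expression cannot throw, as here, the two are observably the same, so the choice only matters where exception ordering or space behaviour is at stake.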

It also occurs to me that the simplifier won't know how to optimise seqS# unless we teach it. For example, we want to eliminate seqS# when applied to a value.

See also Note [Desugaring seq] (1) and (2) in DsUtils.

mentioned in issue #5262

mentioned in commit be544179

mentioned in commit 196785e1

Trac field	Value
Version	7.0.3
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler