GHC: Ticket #5129: "evaluate" optimized away
http://ghc.haskell.org/trac/ghc/ticket/5129
<p>
Reported on Stackoverflow: <a class="ext-link" href="http://stackoverflow.com/questions/5697159/testing-for-error-calls-in-hunit"><span class="icon"></span>http://stackoverflow.com/questions/5697159/testing-for-error-calls-in-hunit</a>
</p>
<p>
With optimizations on, the following program, which normally succeeds (correctly generating an exception), instead fails, and the exception is optimized away.
</p>
<pre class="wiki">import Control.Exception
import Test.HUnit
throwIfNegative :: Int -> String
throwIfNegative n | n < 0 = error "negative"
| otherwise = "no worries"
{-# NOINLINE throwIfNegative #-}
case_negative =
handleJust errorCalls (const $ return ()) $ do
evaluate $ throwIfNegative (-1)
assertFailure "must throw when given a negative number"
where errorCalls (ErrorCall _) = Just ()
main = runTestTT $ TestCase case_negative
</pre><p>
Looking at the core, after a few iterations, the call to <tt>throwIfNegative</tt> is dropped as dead code.
</p>
<p>
Using seq instead of evaluate is happy enough:
</p>
<pre class="wiki">case_negative =
handleJust errorCalls (const $ return ()) $ do
throwIfNegative (-1) `seq` assertFailure "must throw when given a negative number"
where errorCalls (ErrorCall _) = Just ()
</pre><p>
or a bang pattern:
</p>
<pre class="wiki">case_negative =
handleJust errorCalls (const $ return ()) $ do
let !x = throwIfNegative (-1)
assertFailure "must throw when given a negative number"
where errorCalls (ErrorCall _) = Just ()
</pre><p>
Possibly related to <a class="new ticket" href="http://ghc.haskell.org/trac/ghc/ticket/2273" title="bug: inlining defeats seq (new)">#2273</a>
</p>
en-usGHChttp://ghc.haskell.org/trac/ghc/chrome/site/ghc_logo.png
http://ghc.haskell.org/trac/ghc/ticket/5129
Trac 1.0.1ezyangTue, 19 Apr 2011 21:47:44 GMTcc set
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:1
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:1
<ul>
<li><strong>cc</strong>
<em>ezyang@…</em> added
</li>
</ul>
TicketsimonpjSat, 30 Apr 2011 15:53:45 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:2
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:2
<p>
Actually this ticket is a bit different to <a class="new ticket" href="http://ghc.haskell.org/trac/ghc/ticket/2273" title="bug: inlining defeats seq (new)">#2273</a>. Here is what is happening.
</p>
<p>
Here are the background definitions:
</p>
<pre class="wiki">evaluate :: a -> IO ()
-- Defined in GHC.IO.Base
evaluate x = x `seq` return ()
poss_err :: String
-- Assume this doesn't get constant folded away
-- So poss_err isn't definitely bottom.
-- (You had a NOINLINE on throwIfNegative)
poss_err = if (-1) < 0 then error "urk" else "foo"
assertFailure :: String -> IO ()
-- Defined in Test.HUnit.Lang
assertFailure s = throw (userError "blah")
</pre><p>
Now consider the term
</p>
<pre class="wiki">...catch (evaluate poss_err >> assertFailure "foo")...
</pre><p>
By the time we've inlined <tt>(>>)</tt> and <tt>evaluate</tt>, we get something like this
</p>
<pre class="wiki">...catch (\s. case poss_err of _ -> assertFailure "foo" s)...
</pre><p>
Now if GHC sees <tt>(case x of y { ...blah... })</tt>, and <tt>y</tt> is used strictly in <tt>...blah...</tt>, it discards the case expression. After all, we reason that if <tt>x</tt> turned out to throw an exception, we'll get the exception later instead. And indeed, since <tt>...blah...</tt> diverges (well, throws a different error), <tt>y</tt> is used strictly.
</p>
<p>
It's like the case in our imprecise-exception paper where we argue that we should not specify which of the two errors is thrown by <tt>(error "urk" + error "bok")</tt>. It's imprecise.
</p>
<p>
However, this is the IO monad, where the order of evaluation <em>is</em> defined, and the exceptions raise are too. There are two mistakes.
</p>
<ul><li>In <tt>GHC.IO.Base</tt>, the function <tt>evaluate</tt> is small and hence can be inlined. That exposes the fact that the state token is not affected by evaluating the argument. If we didn't inline <tt>evaluate</tt> we'd end up with something like
<pre class="wiki"> \s. case (evaluate poss_err s) of (# s1, _ #) -> assertFailure "blah" s1
</pre>Now we can't drop the case because we need the state token <tt>s1</tt>. If we just put a <tt>{-# NOINLNE #-}</tt> on <tt>evaluate</tt> we'll hide this from GHC so it won't discard such cases.
</li></ul><ul><li>In <tt>Test.HUnit.Lang</tt>, the function <tt>assertFailure</tt> has an IO type, but is defined using <tt>throw</tt>. It should be define using <tt>throwIO</tt>. The whole point of <tt>throwIO</tt> is that it consumes a state token, and that's what sequences it relative to earlier producers of the state token.
</li></ul><p>
I'll fix <tt>evaluate</tt>, but someone else had better deal with <tt>HUnit</tt>.
</p>
<p>
Simon
</p>
TicketezyangSat, 30 Apr 2011 16:06:50 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:3
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:3
<p>
It seems to me that we ought to be able to inline evaluate; if we emitted something directly like:
</p>
<pre class="wiki">case (# touch s, poss_err #) of (# s1, _ #) -> assertFailure "blah" s1
</pre><p>
where <tt>touch</tt> is some primitive that eventually becomes a no-op but enforces ordering (maybe we don’t even need it as it stands?). In this case the case wouldn't be dropped. NOINLINE works, but it seems a little wrong to me to add the overhead of a function application.
</p>
TicketsimonpjMon, 02 May 2011 08:51:01 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:4
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:4
<p>
No -- GHC would just split them apart. You'd need a new primitive:
</p>
<pre class="wiki">evaluate# :: o -> State# a -> State# a
</pre><p>
That is, with the same type as <tt>touch#</tt> but it does evaluation, which <tt>touch#</tt> does not. Otherwise I can't see how to be certain that the eval will never be dropped; the only way I can see is to combine evaluation and state-token-transformation in a way that GHC can't disentangle, hence <tt>evaluate#</tt>.
</p>
<p>
I don't think this is common enough that the performance impact of an out-of-line call will be important.
</p>
<p>
Simon
</p>
TicketsimonpjTue, 03 May 2011 08:30:24 GMTowner set
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:5
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:5
<ul>
<li><strong>owner</strong>
set to <em>simonmar</em>
</li>
</ul>
<p>
Simon and I have just agreed that the right thing is to implement a small family of primops:
</p>
<pre class="wiki">touchS# :: a -> State# b -> State# b -- The current touch#
seqS# :: a -> State# b -> State# b -- Used for 'evaluate'
parS# :: a -> State# b -> State# b -- Use for the par monad
</pre>
TicketezyangWed, 04 May 2011 00:13:04 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:6
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:6
<p>
One rather user-visible result of this ticket is that we should now discourage the use of the idiom
</p>
<pre class="wiki">x `seq` return ()
</pre><p>
I can see a few instances of this in <tt>base</tt>
</p>
<pre class="wiki">ezyang@javelin:~/Dev/ghc-master/libraries/base$ grep -R '`seq` return ()' .
./Foreign/C/String.hs: go [] n = n `seq` return () -- make it strict in n
./Foreign/C/String.hs: go [] n = n `seq` return () -- make it strict in n
./GHC/IO/Handle/Text.hs: c `seq` return ()
./GHC/Conc/Windows.hs: r `seq` return () -- avoid space leak
</pre><p>
and there are probably more elsewhere.
</p>
TicketsimonpjWed, 04 May 2011 06:48:14 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:7
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:7
<p>
Indeed. Every one of these should be call to <tt>evaluate</tt>; that is what <tt>evaluate</tt> is for! Ian: can you make it so?
</p>
<p>
Simon
</p>
TicketsimonpjWed, 04 May 2011 07:02:22 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:8
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:8
<p>
I'm just adding Don's comment and my response from ghc-users.
</p>
<p>
Don says: that's a very common idiom. Interestingly, we have:
</p>
<pre class="wiki">-- | Strict (call-by-value) application, defined in terms of 'seq'.
($!) :: (a -> b) -> a -> b
#ifdef __GLASGOW_HASKELL__
f $! x = let !vx = x in f vx -- see #2273
#elif !defined(__HUGS__) f
$! x = x `seq` f x
#endif
</pre><p>
Simon's response: It's very different, I think.
</p>
<ul><li><tt>evaluate</tt> is in the IO monad, and (should) guarantee to evaluate the argument before proceeding, so if evaluating the argument to WHNF diverges or throws a exception, any exceptions thrown (in the IO monad at least) after the 'evaluate' should not happen. For example:
<pre class="wiki">evaluate (throw "first") >> throwIO (userError "second")
</pre>should guaranteed to throw "first" and not "second". (The current bug is the 'evaluate' doesn't meet its guarantee.)
</li></ul><ul><li>Strict application <tt>($!)</tt> is not in the IO monad, so it merely makes a strictness guarantee. It doesn't guarantee to make exceptions in the argument "beat" exceptions in the function. For example
<pre class="wiki">((:) $! (throw "second")) $! (throw "first")
</pre>does <em>not</em> guaranteed to throw "first" rather than "second".
</li></ul><p>
Does that distinction make sense? Perhaps the contrast is a useful one. I wonder where it might be documented?
</p>
TicketrlWed, 04 May 2011 08:45:44 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:9
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:9
<p>
Why is <tt>x `seq` return ()</tt> discouraged? Why would I use <tt>evaluate</tt> if I want to make my code strict in <tt>x</tt> but don't care when it is evaluated? This seems to be what ezyang's examples are about (the ones in <tt>Foreign.C.String</tt> even say so explicitly) so I don't understand why these should be replaced with <tt>evaluate</tt>.
</p>
TicketsimonpjWed, 04 May 2011 09:15:04 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:10
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:10
<p>
OK fair enough. If all you want is strictness, then <tt>seq</tt> should be fine. If you want exception ordering, or space-leak squashing, then <tt>evaluate</tt> is better.
</p>
TicketezyangWed, 04 May 2011 10:05:28 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:11
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:11
<p>
I would like to understand this tradeoff better. I would like to be able to say "use evaluate whenever the IO monad is available" but it seems you might preclude some optimizations with this advice.
</p>
TicketsimonmarWed, 04 May 2011 12:07:23 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:12
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:12
<p>
Right, it's quite common to use
</p>
<pre class="wiki"> do ...
return $! e
</pre><p>
to avoid the thunk that would otherwise be created for <tt>e</tt>. My guess is that in most cases we don't care about the order of evaluation in these cases, so <tt>$!</tt> is fine.
</p>
<p>
It also occurs to me that the simplifier won't know how to optimise <tt>seqS#</tt> unless we teach it. For example, we want to eliminate <tt>seqS#</tt> when applied to a value.
</p>
TicketsimonmarThu, 09 Jun 2011 10:27:37 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:13
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:13
<p>
See also <tt>Note [Desugaring seq]</tt> (1) and (2) in <tt>DsUtils</tt>.
</p>
Ticketmarlowsd@…Tue, 28 Jun 2011 20:30:51 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:14
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:14
<pre class="wiki">commit be5441799b7d94646dcd4bfea15407883537eaaa
Author: Simon Marlow <marlowsd@gmail.com>
Date: Mon Jun 27 16:45:15 2011 +0100
Add two new primops:
seq# :: a -> State# s -> (# State# s, a #)
spark# :: a -> State# s -> (# State# s, a #)
seq# is a version of seq that can be used in a State#-passing
context. We will use it to implement Control.Exception.evaluate and
thus fix #5129. Also we have plans to use it to fix #5262.
spark# is to seq# as par is to pseq. That is, it creates a spark in a
State#-passing context. We will use spark# and seq# to implement rpar
and rseq respectively in an improved implementation of the Eval monad.
</pre>
TicketsimonmarWed, 06 Jul 2011 13:14:15 GMTstatus changed
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:15
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:15
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>merge</em>
</li>
</ul>
<p>
We now have a test for this, and once that is merged this ticket is done. The remaining issue with seq is tracked in <a class="new ticket" href="http://ghc.haskell.org/trac/ghc/ticket/5262" title="bug: Compiling with -O makes some expressions too lazy and causes space leaks (new)">#5262</a>.
</p>
<pre class="wiki">commit 5ba6997081dd2cba642e61d21d362d050e388e7a
Author: Simon Marlow <marlowsd@gmail.com>
Date: Wed Jun 29 15:00:14 2011 +0100
Add test for #5129
</pre>
TicketiglooFri, 15 Jul 2011 11:56:35 GMTstatus changed; resolution set
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:16
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:16
<ul>
<li><strong>status</strong>
changed from <em>merge</em> to <em>closed</em>
</li>
<li><strong>resolution</strong>
set to <em>fixed</em>
</li>
</ul>
<p>
Merged
</p>
TicketsimonpjFri, 09 Jan 2015 10:33:59 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:17
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:17
<p>
A later note, in the light of <a class="ext-link" href="https://www.haskell.org/pipermail/ghc-devs/2015-January/007900.html"><span class="icon"></span>this email thread</a>.
Looking at the Haddock documentation for <tt>evaluate</tt>, Roman said: we seem to have three possible
implementations for <tt>evaluate</tt>:
</p>
<pre class="wiki">1) evaluate x = return $! x
2) evaluate x = (return $! x) >>= return
3) evaluate x = \s -> seq# x s
</pre><p>
Now (1) is too strict: <tt>(evaluate (error "urk") True)</tt> should yield <tt>True</tt>.
</p>
<p>
And, contrary to my claim in the email thread, (2) suffers from <a class="closed" href="http://ghc.haskell.org/trac/ghc/ticket/5129#comment:2" title="Comment 2 for Ticket #5129">comment:2</a> above: inlining can mean that the argument to <tt>evaluate</tt> never gets evaluated.
</p>
<p>
Consider, under (2)
</p>
<pre class="wiki">evaluate = \x -> (return $! x) >>= return
= \x. \s. case return $! x s of (s1, r) -> return r s1
= \x s. x `seq` case (s,x) of (s1, r) -> return r s1
= \x s. x `seq` (s,x)
= \x s. case x of _ -> (s,x)
</pre><p>
Now consider the call in <a class="closed" href="http://ghc.haskell.org/trac/ghc/ticket/5129#comment:2" title="Comment 2 for Ticket #5129">comment:2</a>:
</p>
<pre class="wiki">evalutate poss_err >> assertFailure "foo"`
= (inline >>)
\s -> case evaluate poss_err of (s1, _) -> assertFailure "foo" s1
= (inline evaluate)
\s -> case (case x of _ -> (s,x)) of (s1, _) -> assertFailure "foo" s1
= (case of case)
case x of _ -> assertFailure "foo" s1
</pre><p>
and now, as explained in <a class="closed" href="http://ghc.haskell.org/trac/ghc/ticket/5129#comment:2" title="Comment 2 for Ticket #5129">comment:2</a>, since <tt>assertFailure</tt> diverges, GHC feel free to discard the case on <tt>x</tt>.
</p>
<p>
Using a primop <tt>seq# :: a -> State# -> (State#, a)</tt> means that the I/O options that follows,
which consumes the state that comes out of the <tt>seq#</tt> simply can't start until the <tt>seq#</tt> has
executed. And that is what we want for <tt>evaluate</tt>.
</p>
<p>
So only (3) will do.
</p>
<p>
The claim in the Haddock docs for <tt>evaluate</tt> that (2) is a correct definition is, I think, false. Simon M added this claim in commit <tt>aa73318a</tt> in (<tt>base</tt> repo). I think I'll simply delete it.
</p>
TicketFeuerbachFri, 09 Jan 2015 13:11:35 GMT
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:18
http://ghc.haskell.org/trac/ghc/ticket/5129#comment:18
<p>
Thanks Simon. I'll try to digest what you've written, and based on my understanding I'll write a better documentation for evaluate that would explain the issue.
</p>
<p>
— Roman
</p>
Ticket