I've submitted a differential to change evaluate to work around the problem, but I don't like it very much. I'd love to have a primitive type of kind Type -> TYPE UnliftedRep that only holds things in WHNF; then we could make the primop return something of that type. But we don't have such a beast right now.
I think the right way for now is probably to stick this in simplAlt. When the scrutinee is seq# x s, we want to behave the way we would for a strict datacon field.
We still don't mark the argument to seq# as evaluated in the case branch:
do_<-evaluatexpure(Strx)
will still think it has to force x again. I suspect there are likely users doing such things. I think the fix should be really simple, but I don't know how to do it. We should (I believe) rewrite
caseseq#xsof(#s',x'#)->E
to
caseseq#xsof(#s',x'#)->E[x->x']
In principle, we could do something like that for spark# as well, but it's probably better to let threads fizzle than to let users rely on the optimizer to make their parallel code do what they expect.
Yes -- this is a variant of the case binder-swap in OccurAnal. See Note [Binder swap] in OccurAnal. This is the place to do it.
I eventually found that, but I'm not at all sure how to deal with coercions in that context. We could, for example, have something like
caseseq#xs`cast`...of(#s',x'#)->...
in which case we have to work out how to rejigger all the coercions. I don't know enough about that machinery yet. In the current OccurAnal code, the coercion in the scrutinee is always on a variable, but here it's on a pair containing the variable, so I'm not going to be able to code monkey it.
The key OccAnal functions involved seem to be mkAltEnv (not sure what this does, but it seems to look for Vars when I want to look also for applications of seq#) and wrapAltRHS (which seems to actually install the Let when appropriate), but I don't see how it all fits together.
I'm also not sure it's worth the trouble, although the current state of affairs seems a bit tricky to document in the Haddocks for evaluate. But considering Control. Parallel.Strategies.rdeepseq, I realized that even this binder swap, in combination with what I've already done, isn't really quite enough. Suppose we have
lete=x+3::Intincaseseq#esof(#s',e'#)->E
We'd actually like to know that not only e and e', but also x, are evaluated in E, because e is strict in x. So if we do a binder swap, we should do it for all the variables the scrutinee is strict in that are not already known to be evaluated.
So if we do a binder swap, we should do it for all the variables the scrutinee is strict
This is a bridge too far! Strictness analysis will work this out, I think. Eg that let e will turn into a case.
I don't think so. seq# is intentionally lazy in its argument, to allow explicit ordering in an IO context. This seems pretty important in combination with spark#, for example.
commit 502026fc0a35460c7f04b26a11320723a7bbfdffAuthor: David Feuer <david.feuer@gmail.com>Date: Mon Jun 11 10:32:23 2018 -0400 Make seq# evaluatedness look through casts In d964b05, I forgot to look through casts to find the `seq#` identifier. Fix that. Reviewers: bgamari Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4804>---------------------------------------------------------------502026fc0a35460c7f04b26a11320723a7bbfdff compiler/coreSyn/CoreSyn.hs | 3 ++- testsuite/tests/perf/should_run/{T15226.hs => T15226a.hs} | 5 ++++- testsuite/tests/perf/should_run/all.T | 9 +++++++++ 3 files changed, 15 insertions(+), 2 deletions(-)diff --git a/compiler/coreSyn/CoreSyn.hs b/compiler/coreSyn/CoreSyn.hs index 4dd70b0..50e40d1 100644--- a/compiler/coreSyn/CoreSyn.hs+++ b/compiler/coreSyn/CoreSyn.hs@@ -2046,10 +2046,11 @@ collectArgs expr go e as = (e, as) -- | Attempt to remove the last N arguments of a function call.--- Strip off any ticks encountered along the way and any ticks+-- Strip off any ticks or coercions encountered along the way and any -- at the end. stripNArgs :: Word -> Expr a -> Maybe (Expr a) stripNArgs !n (Tick _ e) = stripNArgs n e+stripNArgs n (Cast f _) = stripNArgs n f stripNArgs 0 e = Just e stripNArgs n (App f _) = stripNArgs (n - 1) f stripNArgs _ _ = Nothing diff --git a/testsuite/tests/perf/should_run/T15226.hs b/testsuite/tests/perf/should_run/T15226a.hssimilarity index 89%copy from testsuite/tests/perf/should_run/T15226.hscopy to testsuite/tests/perf/should_run/T15226a.hsindex 4c09114..6e9a1db 100644--- a/testsuite/tests/perf/should_run/T15226.hs+++ b/testsuite/tests/perf/should_run/T15226a.hs@@ -3,6 +3,7 @@ import Control.Exception (evaluate) -- Just in case Prelude.repeat changes for some reason. import Prelude hiding (repeat)+import Data.Coerce -- We want to be sure that the compiler *doesn't* know that -- all the elements of the list are in WHNF, because if it @@ -12,11 +13,13 @@ repeat a = res where res = a : res {-# NOINLINE repeat #-} -- Belt *and* suspenders+newtype Foo = Foo Int+ silly :: [Int] -> IO () silly = foldr go (pure ()) where go x r = do- x' <- evaluate x+ x' <- (coerce (evaluate :: Foo -> IO Foo) :: Int -> IO Int) x evaluate (x' + 3) -- GHC should know that x' has been evaluated, -- so this calculation will be erased entirely. -- Otherwise, we'll create a thunk to pass to diff --git a/testsuite/tests/perf/should_run/all.T b/testsuite/tests/perf/should_run/all.Tindex b248dd5..0e7996ef 100644--- a/testsuite/tests/perf/should_run/all.T+++ b/testsuite/tests/perf/should_run/all.T@@ -584,3 +584,12 @@ test('T15226', only_ways(['normal'])], compile_and_run, ['-O'])++test('T15226a',+ [stats_num_field('bytes allocated',+ [ (wordsize(64), 41040, 5) ]),+ # 2018-06-06 41040 Look through casts for seq#+ # initial 400041040+ only_ways(['normal'])],+ compile_and_run,+ ['-O'])
seq# is intentionally lazy in its argument, to allow explicit ordering in an IO context
Hmnm. Can you give an example? Nothing in seq#'s documentation says that. It jolly well should!
Considering seq# strict can be rather bad, I believe. If we turn .... seq# x s into case x of x' {DEFAULT__ -> .... seq# x' s} then we'll see that x' is evaluated and erase the seq#. That sort of thing is the very sort of trouble seq# was intended to avoid.
\s. let x = blah in let y = x+1 in case seq# x s of (# s', x' #) -> ...
It seems fine to me transform this to
\x. case blah of x -> let y = x+1 in case seq# x s of (# s', x' #) -> ...
What if the seq# is after some other IO operations thus:
\s. let x = blah in case f s of (# s1, r #) -> case seq# x s of (# s2, x' #) -> ......
Now you might worry that x might be evaluated (and throw an exception) before
f gets a chance to run. But it doesn't: there's a hack in the strictness analyser
(see See Note [IO hack in the demand analyser] in DmdAnal) that will make
x's binding lazy; in effect the strictness analyser treats the case f s of ...
as if it had an extra invisible alternative not mentioning x.
It's not that important. But I think that seq# can safely be strict in x.
I think I agree that seq# could be strict in its first argument. Of course, we'd give it a demand signature of arity 2, reflecting that seq# undefined `seq` \s -> s must not throw whereas seq# undefined s might, as required by the documentation on evaluate.
I think that doing so will also probably make the Simplifier case-bind the argument x and use the case-binder x' in the occurrences of x, and x' is automatically marked evaluated. So it should be enough to give seq# a demand signature in primops.txt.pp.
err1 and err2 must be pure computations which may throw imprecise exceptions, but never precise ones.
So indeed, it would be a possible imprecise exception semantics trace to have err2 throw its imprecise exception before err1 had a chance. But that is something that is true today, the compiler simply doesn't exploit it.
That's just not what I'd expect of seq#. I expect precise behavior—the argument is forced when it runs and not before. What do you think it's supposed to mean?
So indeed, it would be a possible imprecise exception semantics trace to have err2 throw its imprecise exception before err1 had a chance. But that is something that is true today, the compiler simply doesn't exploit it.
@sgraf812 I'm not sure about that. See Note [Precise exceptions and strictness analysis] in Demand, especially Scenario 2: Precise exceptions in case scrutinees. Reading that Note, the question is: does seq# fall under Note [Which scrutinees may throw precise exceptions] (in DmdAnal)? In the latter note we say
For an expression @f a1 ... an :: ty@ we determine that 1. False If ty is *not* @State# RealWorld@ or an unboxed tuple thereof. This check is done by 'forcesRealWorld'. (Why not simply unboxed pairs as above? This is motivated by T13380{d,e}.) 2. False If f is a PrimOp, and it is *not* raiseIO# 3. False If f is an unsafe FFI call ('PlayRisky') _. True Otherwise "give up".
but we could prefectly well adapt (2) to say ...raiseIO# or seq#. I have some sympathy with this because as David says, the whole purpose of seq# is to make imprecise exceptions precise!
There is no raiseIO# in the entire program, so there are no precise exceptions and the Note does not apply.
I too find that unfortunate, but have a look at this Note !9874 (diffs) for why that is.
We could talk about doing a deferAfterPreciseException not only for raiseIO# but also for seq# if we were inclined to say that seq# is side-effecting, but seq# really is strict in its argument! There is no wiggle-room on that.
What would deferring mean in practice? Simply that evaluate x >> a is not strict in any strict variable of a, achieving the semantics that David wants.
But as the Note above points out, we have to carefully check whether we introduced regressions in doing so. And perhaps think about giving other, actually side-effecting primops a similar treatment. Have a look at the example in #20749 (comment 474214), I find that much more daunting than preferring one imprecise exception over another.
I've created another issue (#22935) for this discussion.
In the meantime, the exact original issue has re-surfaced (but in STG rather than Core) since the tag inference used to enforce the new strict field invariant does not know about seq#. I noticed and fixed this in version 1 of !9359 and have added a test looking at the final STG for a similar program.