= Seq magic =
The innocent-looking `seq` operator causes all manner of mayhem in GHC. This page summarises the issues. See also discussion in Trac #5129, #5262
== The baseline position ==
Our initial story was that `(seq e1 e2)` meant precisely
{{{
case e1 of { _ -> e2 }
}}}
Indeed this was `seq`'s inlining. This translation validates some important rules
{{{
* `seq` is strict in both its arguments
* (e1 `seq` e2) e3 ===> e1 `seq` (e2 e3)
* case (e1 `seq` e2) of alts ===> e1 `seq` (case e2 of alts)
* value `seq` e ===> e
}}}
But this approach has problems; see `Note [Deguaring seq]` in `DsUtils`.
=== Problem 1 (Trac #1031) ===
Consider
{{{
f x y = x `seq` (y `seq` (# x,y #))
}}}
The `[CoreSyn let/app invariant]` (see `CoreSyn`) means that, other things being equal, because
the argument to the outer `seq` has an unlifted type, we'll use call-by-value thus:
{{{
f x y = case (y `seq` (# x,y #)) of v -> x `seq` v
}}}
But that is bad for two reasons:
* we now evaluate `y` before `x`, and
* we can't bind `v` to an unboxed pair
Seq is very, very special! Treating it as a two-argument function, strict in
both arguments, doesn't work. We "fixed" this by treating `seq` as a language
construct, desugared by the desugarer, rather than as a function that may (or
may not) be inlined by the simplifier. So the above term is desugared to:
{{{
case x of _ -> case y of _ -> (# x,y #)
}}}
=== Problem 2 (Trac #2273) ===
Consider
{{{
let chp = case b of { True -> fst x; False -> 0 }
in chp `seq` ...chp...
}}}
Here the `seq` is designed to plug the space leak of retaining `(snd x)`
for too long.
If we rely on the ordinary inlining of `seq`, we'll get
{{{
let chp = case b of { True -> fst x; False -> 0 }
case chp of _ { I# -> ...chp... }
}}}
But since `chp` is cheap, and the case is an alluring contet, we'll
inline `chp` into the case scrutinee. Now there is only one use of `chp`,
so we'll inline a second copy. Alas, we've now ruined the purpose of
the seq, by re-introducing the space leak:
{{{
case (case b of {True -> fst x; False -> 0}) of
I# _ -> ...case b of {True -> fst x; False -> 0}...
}}}
We can try to avoid doing this by ensuring that the binder-swap in the
case happens, so we get his at an early stage:
{{{
case chp of chp2 { I# -> ...chp2... }
}}}
But this is fragile. The real culprit is the source program. Perhaps we
should have said explicitly
{{{
let !chp2 = chp in ...chp2...
}}}
But that's painful. So the desugarer does a little hack to make `seq`
more robust: a saturated application of `seq` is turned '''directly''' into
the case expression, thus:
{{{
x `seq` e2 ==> case x of x -> e2 -- Note shadowing!
e1 `seq` e2 ==> case x of _ -> e2
}}}
So we desugar our example to:
{{{
let chp = case b of { True -> fst x; False -> 0 }
case chp of chp { I# -> ...chp... }
}}}
And now all is well.
Be careful not to desugar
{{{
True `seq` e ==> case True of True { ... }
}}}
which stupidly tries to bind the datacon 'True'. This is easily avoided.
The whole thing is a hack though; if you define `mySeq=seq`, the hack
won't work on `mySeq`.
=== Problem 3 (Trac #5262) ===
Consider
{{{
f x = x `seq` (\y.y)
}}}
With the above desugaring we get
{{{
f x = case x of x { _ -> \y.y }
}}}
and now ete expansion gives
{{{
f x y = case x of x { _ -> y }
}}}
Now suppose that we have
{{{
f (length xs) `seq` 3
}}}
Plainly `(length xs)` should be evaluated... but it isn't because `f` has arity 2.
(Without -O this doesn't happen.)
=== Problem 4: seq in the IO monad ===
See the extensive discussion in Trac #5129.
=== Problem 5: the need for special rules ===
Roman found situations where he had
{{{
case (f n) of _ -> e
}}}
where he knew that `f` (which was strict in `n`) would terminate if n did.
Notice that the result of `(f n)` is discarded. So it makes sense to
transform to
{{{
case n of _ -> e
}}}
Rather than attempt some general analysis to support this, I've added
enough support that you can do this using a rewrite rule:
{{{
RULE "f/seq" forall n e. seq (f n) e = seq n e
}}}
You write that rule. When GHC sees a case expression that discards
its result, it mentally transforms it to a call to `seq` and looks for
a RULE. (This is done in `Simplify.rebuildCase`.) As usual, the
correctness of the rule is up to you.
To make this work, we need to be careful that `seq` is '''not''' desguared
into a case expression on the LHS of a rule.
To increase applicability of these user-defined rules, we also
have the following built-in rule for `seq`
{{{
seq (x |> co) y = seq x y
}}}
This eliminates unnecessary casts and also allows other seq rules to
match more often. Notably,
{{{
seq (f x |> co) y --> seq (f x) y
}}}
and now a user-defined rule for `seq` may fire.
= A better way =
Here's our new plan.
* Introduce a new primop `seq# :: a -> State# s -> (# a, State# s #)` (see be5441799b7d94646dcd4bfea15407883537eaaa)
* Implement `seq#` by turning it into the obvious eval in the backend. In fact, since the return convention for `(# State# s, a #)` is exactly the same as for `a`, we can implement `seq# s a` by `a` (even when it appears as a case scrutinee).
* Define `evaluate` thus
{{{
evaluate :: a -> IO a
evaluate x = IO $ \s -> seq# x s
}}}
That fixes problem 4.
We could go on and desugar `seq` thus:
{{{
x `seq` e2 ==> case seq# x RW of (# x, _ #) -> e2 -- Note shadowing!
e1 `seq` e2 ==> case seq# x RW of (# _, _ #) -> e2
}}}
and if we consider `seq#` to be expensive, then we won't eta-expand around it, and that would fix problem 3.
However, there is a concern that this might lead to performance regressions in examples like this:
{{{
f :: Int -> Int -> IO Int
f x y | x `seq` False = undefined
f x 3 = do
... some IO monad code here ...
}}}
so `f` turns into
{{{
f = \x . \y . case seq# x RW of (# _, x #) -> case y of 3 -> \s . some IO monad code
}}}
and we won't get to eta-expand the `\s` as we would normally do (this is pretty important for getting good performance from IO and ST monad code).
Arguably `f` should be rewritten with a bang pattern, and we should treat bang patterns as the eta-expandable seq and translate them directly into `case`, not `seq#`. But this would be a subtle difference between `seq` and bang patterns.
Furthermore, we already have `pseq`, which is supposed to be a "strictly ordered seq", that is it preserves evaluation order. So perhaps `pseq` should be the one that more accurately implements the programmer's intentions, leaving `seq` as it currently is.
We are currently pondering what to do here.