Opened 11 years ago
Last modified 3 years ago
#2273 new bug
inlining defeats seq
Reported by: | igloo | Owned by: | |
---|---|---|---|
Priority: | lowest | Milestone: | |
Component: | Compiler | Version: | 6.9 |
Keywords: | Cc: | ezyang@… | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | Runtime performance bug | Test Case: | |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description
Consider this module:
module Q (tcExtendIdEnv2) where -- Interesting code: tcExtendIdEnv2 :: M a tcExtendIdEnv2 = do env <- getEnv let level :: Int level = thLevel (tcl_th_ctxt env) level `seq` tc_extend_local_id_env level {-# NOINLINE tc_extend_local_id_env #-} tc_extend_local_id_env :: Int -> M a tc_extend_local_id_env th_lvl = if read "foo" then th_lvl `seq` return undefined else return undefined thLevel :: ThStage -> Int thLevel Comp = 0 thLevel (Splice l) = l thLevel (Brack l) = l -- Dull code: type M a = IOEnv TcLclEnv a data TcLclEnv = TcLclEnv { tcl_th_ctxt :: !ThStage } data ThStage = Comp | Splice Int | Brack Int getEnv :: IOEnv env env getEnv = IOEnv (\ env -> return env) newtype IOEnv env a = IOEnv { unIOEnv :: env -> IO a } instance Monad (IOEnv m) where IOEnv m >>= f = IOEnv (\ env -> do r <- m env unIOEnv (f r) env ) return a = IOEnv (\ _ -> return a) fail = error
Compiling with
ghc -O -ddump-simpl -ddump-stg -c Q.hs
we get, in the STG,
Q.$wa = \r srt:(0,*bitmap*) [ww_sDx w_sDO] case case ww_sDx of wild_sEs { Q.Comp -> Q.lvl; Q.Splice l_sDA -> l_sDA; Q.Brack l_sDC -> l_sDC; } of tpl_sEt { GHC.Base.I# ipv_sEu -> let { sat_sDN = NO_CCS Q.TcLclEnv! [ww_sDx]; } in let { sat_sDL = \u [] case ww_sDx of wild_sEv { Q.Comp -> Q.lvl; Q.Splice l_sDH -> l_sDH; Q.Brack l_sDJ -> l_sDJ; }; } in Q.tc_extend_local_id_env sat_sDL sat_sDN w_sDO; };
GHC seems to have inlined level
, forced it (due to the seq), but passed along a second, inlined, unevaluated copy to tc_extend_local_id_env
. So the whole environment is retained, defeating the purpose of the seq!
If I mark level
as NOINLINE
then the STG looks like this:
Q.a5 = \r srt:(0,*bitmap*) [env_sD1 eta_sDh] case env_sD1 of tpl_sDg { Q.TcLclEnv ipv_sD5 -> case case ipv_sD5 of wild_sDN { Q.Comp -> Q.lvl; Q.Splice l_sD8 -> l_sD8; Q.Brack l_sDa -> l_sDa; } of level_sDc { __DEFAULT -> case level_sDc of tpl1_sDf { GHC.Base.I# ipv1_sDO -> Q.tc_extend_local_id_env tpl1_sDf tpl_sDg eta_sDh; }; }; };
which fixes the env-retained problem, although I don't understand why two cases are done.
It would be nice not to have to resort to this level of trickery, though!
Attachments (1)
Change History (31)
comment:1 Changed 11 years ago by
comment:2 Changed 11 years ago by
Owner: | set to igloo |
---|---|
Type: | bug → merge |
Fixed by
Fri May 16 09:51:49 GMT Daylight Time 2008 simonpj@microsoft.com * Improve the treatment of 'seq' (Trac #2273) Trac #2273 showed a case in which 'seq' didn't cure the space leak it was supposed to. This patch does two things to help a) It removes a now-redundant special case in Simplify, which switched off the case-binder-swap in the early stages. This isn't necessary any more because FloatOut has improved since the Simplify code was written. And switching off the binder-swap is harmful for seq. However fix (a) is a bit fragile, so I did (b) too: b) Desugar 'seq' specially. See Note [Desugaring seq (2)] in DsUtils This isn't very robust either, since it's defeated by abstraction, but that's not something GHC can fix; the programmer should use a let! instead. M ./compiler/basicTypes/MkId.lhs -6 +14 M ./compiler/deSugar/DsUtils.lhs -9 +44 M ./compiler/simplCore/Simplify.lhs -12 +15
Most of the new lines are comments!
I believe this could usefully be moved to the branch too.
I wonder if it'd be worth a test in eyeball
?
Simon
comment:3 Changed 11 years ago by
are the following equivalent?
a `fseq` \a' -> b `fseq` \b' -> (a',b')
b `fseq` \b' -> a `fseq` \a' -> (a',b')
(I guess this is the usual question where the order of strictness doesn't matter because "imprecise" exceptions? but the order might have non-obvious performance implications? It's simpler than let
because it's not mutually/recursive by default...)
comment:4 Changed 11 years ago by
They are semantically equivalent, but could perhaps have different space behaviour.
Simon
comment:6 Changed 10 years ago by
Architecture: | Unknown → Unknown/Multiple |
---|
comment:7 Changed 10 years ago by
Operating System: | Unknown → Unknown/Multiple |
---|
comment:8 Changed 9 years ago by
Resolution: | fixed |
---|---|
Status: | closed → reopened |
See also this thread, http://www.haskell.org/pipermail/cvs-ghc/2009-September/050164.html, which describes another instance of the very same thing. Furthermore, this instance is NOT FIXED by the above patch. So I'm re-opening.
Simon
comment:9 Changed 9 years ago by
Owner: | igloo deleted |
---|---|
Status: | reopened → new |
comment:10 Changed 9 years ago by
Note to self. I think the Right Solution is simply not to inline things that are "cheap", but only things that are "values". This is easy to achieve:
hunk ./compiler/coreSyn/CoreUnfold.lhs 637 - yes_or_no = active_inline && is_cheap && consider_safe + yes_or_no = active_inline && is_value && consider_safe
But it's not quite so easy: this change makes allocation go up in a couple of programs, and runtime goes up quite a bit. (We'd need to double-check that the runtime figures are right.)
-------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed -------------------------------------------------------------------------------- fulsom -2.8% +17.2% +39.1% +39.8% puzzle +1.0% +8.1% +15.2% +18.6% atom +0.7% +7.3% +55.7% +64.7% circsim +0.5% -0.0% +35.2% +37.1% compress2 +0.7% -0.0% +25.8% +29.9% lcss +0.7% -0.0% +42.8% +48.1% -------------------------------------------------------------------------------- Min -2.8% -4.3% -4.8% -10.0% Max +1.0% +17.2% +55.7% +64.7% Geometric Mean +0.5% +0.5% +18.2% +21.4%
So, annoyingly, more investigation required. I can't do it today, so I'm recording the breadcrumbs here for when I get back to it.
Simon
comment:11 Changed 9 years ago by
Type: | merge → bug |
---|
comment:12 Changed 9 years ago by
Milestone: | → 6.12 branch |
---|
comment:13 Changed 9 years ago by
Milestone: | 6.12 branch → 6.12.3 |
---|
comment:14 Changed 9 years ago by
Milestone: | 6.12.3 → 6.14.1 |
---|---|
Priority: | normal → low |
comment:15 Changed 8 years ago by
Milestone: | 7.0.1 → 7.0.2 |
---|
comment:16 Changed 8 years ago by
Milestone: | 7.0.2 → 7.2.1 |
---|
comment:17 Changed 8 years ago by
Cc: | ezyang@… added |
---|---|
Type of failure: | → None/Unknown |
comment:18 Changed 7 years ago by
Milestone: | 7.2.1 → 7.4.1 |
---|
comment:19 Changed 7 years ago by
Milestone: | 7.4.1 → 7.6.1 |
---|---|
Priority: | low → lowest |
comment:20 Changed 6 years ago by
Milestone: | 7.6.1 → 7.6.2 |
---|
Changed 6 years ago by
comment:21 Changed 6 years ago by
Seems to work as desired with 7.6.1:
ghc -O -ddump-simpl -ddump-stg -c T2273.hs
yields (in the STG code):
[GblId, Arity=2, Str=DmdType S(S)L, Unf=OtherCon []] = \r srt:(0,*bitmap*) [env_sxC s_sxP] case env_sxC of wild_sxO { Q.TcLclEnv ds_sxF -> case case ds_sxF of _ { Q.Comp -> Q.tcExtendIdEnv3; Q.Splice l_sxI -> l_sxI; Q.Brack l_sxK -> l_sxK; } of level_sxN { GHC.Types.I# ipv_syf -> Q.tc_extend_local_id_env level_sxN wild_sxO s_sxP; }; };
comment:22 Changed 6 years ago by
The ticket was reopened for a different example: http://hackage.haskell.org/trac/ghc/ticket/2273#comment:8
comment:24 Changed 4 years ago by
commit 970816ac0028f2f42ac4140d29e2f0dfe0e9af3e
Author: Simon Marlow <> Date: Wed Sep 16 14:04:54 2009 +0000 Use let !y = x in .. x .. instead of seq in $! and evaluate (#2273)
comment:25 Changed 4 years ago by
Type of failure: | None/Unknown → Runtime performance bug |
---|
SPJ: what is the status of this ticket? You reopened it in comment:8 and comment:10, but this ticket is not on Status/SLPJ-Tickets.
The link in comment:8 is unfortunately broken.
The following tickets all link here: #2457, #3263, #5129, #9127, #9353.
comment:26 Changed 4 years ago by
I've seen some remarks in the ghc source tree about having the desugaring use seq#, but I didnt see any usage of it in ghc proper as far as I could tell.
comment:27 Changed 4 years ago by
Milestone: | 7.10.1 → 7.12.1 |
---|
Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.
comment:28 Changed 4 years ago by
Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.
comment:30 Changed 3 years ago by
Milestone: | 8.0.1 |
---|
Thanks Ian -- nice report. What is happening is this. After inlining seq we get something like this:
Now GHC thinks that the case-expression for
level
is cheap (which it is), and therefore ok to duplicate, so it inlines itNow there's only one remaining occurrence of
level
so that gets inlined too. Disaster.This example has made me realise that what is really wrong here is that
seq
should really have a type more likeThen we'd have
Now the inner
level
is nothing to do with the outerlevel
and all will be well. So here we are saying what we mean: "evaluatelevel
and use the evaluated version inside here".This version of
seq
is just reversed strict function application, of course. Which is very like strictlet
. So with bang patterns we could also writeThis is satisfactorily explicit, but we must use a new name (
level2
). Perhaps that's not unreasonable.Both
fseq
and a strict let could desugar toMeanwhile I think a good fix will be to change the desugarer to desugar saturated applications of
seq
to this same form. Not very robust to abstraction, but better than what we have now.seq
is subtler than it looks.