Opened 6 years ago
Last modified 14 months ago
#7542 new bug
GHC doesn't optimize (strict) composition with id
Reported by: | shachaf | Owned by: | simonpj |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Compiler | Version: | 7.6.1 |
Keywords: | Cc: | lelf, ekmett@…, jwlato@…, hackage.haskell.org@…, pho@…, dfeuer, wren@… | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | Runtime performance bug | Test Case: | |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description
Newtype constructors and selectors have no runtime overhead, but some uses of them do. For example, given newtype Identity a = Identity { runIdentity :: a }
, Identity
turns into id
, but Identity . f
turns into id . f
, which is distinct from f
, because it gets eta-expanded to \x -> f x
.
It would be nice to be able to compose a newtype constructor with a function without any overhead. The obvious thing to try is strict composition:
(#) :: (b -> c) -> (a -> b) -> a -> c (#) f g = f `seq` g `seq` \x -> f (g x)
In theory this should get rid of the eta-expansion. In practice, the generated Core looks like this:
foo :: (a -> b) -> [a] -> [b] foo f = map (id # f) -- becomes foo = \f e -> map (case f of g { __DEFAULT -> \x -> g x }) e
Different variations of (#)
, and turning -fpedantic-bottoms
on, don't seem to affect this. A simpler version, foo f = map (f
seq \x -> f x)
, generates the same sort of Core.
In one library we resorted to defining a bunch of functions of the form identityDot :: (a -> b) -> a -> Identity b; identityDot = unsafeCoerce
. It would be better to be able to rely on GHC to do the optimization directly, if we use strict composition anyway.
Change History (23)
comment:1 Changed 6 years ago by
difficulty: | → Unknown |
---|---|
Status: | new → infoneeded |
comment:2 Changed 6 years ago by
Status: | infoneeded → new |
---|
Here's an example of the sort of context this comes up in:
module T7542 where import Unsafe.Coerce newtype Id a = MkId { unId :: a } -- Think of `mapped` as `mapM`, but restricted to Id (we could make it work -- with any Functor, rather than just []). `over` takes the Id wrappers back -- off. The goal is to make it easy to compose mapped with other functions of -- the same form. The wrapper should be "free" because it's just newtype noise. mapped1 :: (a -> Id b) -> [a] -> Id [b] mapped1 f = MkId . map (unId . f) over1 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t over1 l f = unId . l (MkId . f) map1 :: (a -> b) -> [a] -> [b] map1 f xs = over1 mapped1 f xs -- Core: map1 = \f xs -> map (\x -> f x) xs -- over1 mapped1 = unId . MkId . map (unId . MkId . f) -- ~ map -- However, if f = ⊥, unId . MkId . f /= f! -- Therefore `over1 mapped1` must turn into \f -> map (\x -> f x) -- We can't expect GHC to compile it to `map` because it has different strictness. -- Let's define strict versions of (MkId .) and (unId .): mkIdDot2 :: (a -> b) -> a -> Id b mkIdDot2 f = f `seq` \x -> MkId (f x) unIdDot2 :: (a -> Id b) -> a -> b unIdDot2 f = f `seq` \x -> unId (f x) mapped2 :: (a -> Id b) -> [a] -> Id [b] mapped2 f = mkIdDot2 (map (unIdDot2 f)) over2 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t over2 l f = unIdDot2 (l (mkIdDot2 f)) map2 :: (a -> b) -> [a] -> [b] map2 f xs = over2 mapped2 f xs -- map2 should have the same semantics as map. But the Core isn't the same: -- Without -fpedantic-bottoms: map2 = \f xs -> map (\e -> f e) xs -- With -fpedantic-bottoms: -- map2 = \f xs -> map (case f of g { __DEFAULT -> \x -> g x }) xs -- Ideally, (case f of g { __DEFAULT -> \x -> g x }) would simply be f. -- Let's try manually telling GHC that our newtype compositions are coercions: -- (Ideally, this is what mkIdDot2 and unIdDot2 would compile into.) mkIdDot3 :: (a -> b) -> a -> Id b mkIdDot3 = unsafeCoerce unIdDot3 :: (a -> Id b) -> a -> b unIdDot3 = unsafeCoerce -- (Note: Due to #7398, we couldn't define a strict composition operator and -- rely on RULES to turn (MkId `dot`) into unsafeCoerce -- the `MkId` itself -- gets turned into a coercion before any RULES have a chance to fire.) mapped3 :: (a -> Id b) -> [a] -> Id [b] mapped3 f = mkIdDot3 (map (unIdDot3 f)) over3 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t over3 l f = unIdDot3 (l (mkIdDot3 f)) map3 :: (a -> b) -> [a] -> [b] map3 f xs = over3 mapped3 f xs -- Core: map3 = map
comment:3 Changed 6 years ago by
Cc: | ekmett@… added |
---|
comment:4 Changed 6 years ago by
Cc: | jwlato@… added |
---|
comment:5 Changed 6 years ago by
Cc: | hackage.haskell.org@… added |
---|
comment:6 Changed 6 years ago by
Cc: | pho@… added |
---|
comment:7 Changed 5 years ago by
commit 35f1fc957d152c520c90c6bd2330266e57578eb2
Author: Simon Peyton Jones <simonpj@microsoft.com> Date: Tue Jan 22 22:46:33 2013 +0000 Allow CaseElim if the case binder is the next thing to be eval'd This makes CaseElim happen a bit more often. See Note [Case binder next] in Simplify. This came up when investigating Trac #7542. compiler/simplCore/Simplify.lhs | 32 ++++++++++++++++++++++++++------ 1 files changed, 26 insertions(+), 6 deletions(-)
comment:8 Changed 5 years ago by
Also this:
commit 7a1480c7c590d4d2fa7a105a4eebf299e921e056 Author: Simon Peyton Jones <simonpj@microsoft.com> Date: Tue Jan 22 22:43:22 2013 +0000 Allow eta-reduction of eval'd functions if of arity 1 See Note [Eta reduction of an eval'd function] in CoreUtils. This doesn't fix Trac #7542, but that was the ticket that pointed out this infelicity. >--------------------------------------------------------------- compiler/coreSyn/CoreUtils.lhs | 24 ++++++++++++++++++++++-- 1 files changed, 22 insertions(+), 2 deletions(-) diff --git a/compiler/coreSyn/CoreUtils.lhs b/compiler/coreSyn/CoreUtils.lhs index 7017f70..9b527e7 100644 --- a/compiler/coreSyn/CoreUtils.lhs +++ b/compiler/coreSyn/CoreUtils.lhs @@ -1712,8 +1712,14 @@ tryEtaReduce bndrs body --------------- fun_arity fun -- See Note [Arity care] - | isLocalId fun && isStrongLoopBreaker (idOccInfo fun) = 0 - | otherwise = idArity fun + | isLocalId fun + , isStrongLoopBreaker (idOccInfo fun) = 0 + | arity > 0 = arity + | isEvaldUnfolding (idUnfolding fun) = 1 + -- See Note [Eta reduction of an eval'd function] + | otherwise = 0 + where + arity = idArity fun --------------- ok_lam v = isTyVar v || isEvVar v @@ -1737,6 +1743,20 @@ tryEtaReduce bndrs body ok_arg _ _ _ = Nothing \end{code} +Note [Eta reduction of an eval'd function] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In Haskell is is not true that f = \x. f x +because f might be bottom, and 'seq' can distinguish them. + +But it *is* true that f = f `seq` \x. f x +and we'd like to simplify the latter to the former. This amounts to +the rule that + * when there is just *one* value argument, + * f is not bottom +we can eta-reduce \x. f x ===> f + +This turned up in Trac #7542.
comment:9 Changed 5 years ago by
The previous two commits improve the situation a bit; both were triggered by looking at the code from this example, thanks.
I'm far from sure that we get perfect code now. But I'm also convinced that this is the wrong way to solve this problem: check out NewtypeWrappers.
Meanwhile do have a go with HEAD and see if you get better code now.
Simon
comment:10 Changed 5 years ago by
Milestone: | → 7.8.1 |
---|---|
Owner: | set to simonpj |
Simon, should this ticket be closed now?
comment:11 Changed 4 years ago by
The simple example from the ticket:
(#) :: (b -> c) -> (a -> b) -> a -> c (#) f g = f `seq` g `seq` \x -> f (g x) foo :: (a -> b) -> [a] -> [b] foo f = map (id # f)
produces
T7542.foo = \ (@ a) (@ b) (f :: a -> b) (eta :: [a]) -> GHC.Base.map @ a @ b (\ (eta1 :: a) -> f eta1) eta
without -fpedantic-bottoms
and
T7542.foo = GHC.Base.map
with, which looks fine. The same happens to the more elaborate example in comment:2 (middle section), -fpedantic-bottoms
preserves the semantics and with that we get map2 = map
.
So it seems that situation has improved and we can close the ticket (which should not discourage anyone who finds a situation that needs fixing to open it again).
comment:12 Changed 4 years ago by
@nomeata:
As I understand the situation, the fact that this optimization isn't happening without -fpedantic-bottoms
means I'm basically stuck using the existing hacks in lens
.
The code in question is set up to INLINE
in library into user code, and they very likely aren't compiling with that flag! (Unless I'm misunderstanding how and where that gets tracked and applid.)
We have 3 cases to consider, the existing unsafeCoerce hack, the naive eta expansion we're getting without -fpedantic-bottoms
, and the version without the eta expansion we can produce with -fpedantic-bottoms
. It appears that the first and the third scenarios are now the same, but the difference in performance between the existing unsafeCoerce hack and the eta expanded code can be asymptotic, if this is closed as is, the fix does me no good, and I'll be left using the existing coercions to ensure a good user experience. =/
comment:13 Changed 4 years ago by
@ekmett: Ok, thanks for the input. So clearly it should _not_ be closed then.
So what you’d like to see is to have GHC behave here as if -fpedantic-bottoms
is on always.
Obviously, one could make that flag the default, but there must be reasons why it is not the default.
What I find surprising: The core generated for
mkIdDot2 :: (a -> b) -> a -> Id b mkIdDot2 f = f `seq` \x -> MkId (f x)
is
mkIdDot2 = (λ f. f) |> (some cast involving Id))
which is the same core that we get with mkIdDot2 = coerce
(hey, no unsafe!), but the latter produces the desired map2 = map
... hard to see (without knowing the simplifier by heart) where this goes wrong.
comment:14 Changed 4 years ago by
Ah, interesting. The problem is not mkIdDot2
(that has Arity 1 allright), the problem is unIdDot2
, which suddenly as artiy 2!
T7542.unIdDot2 :: forall a_az8 b_az9. (a_az8 -> T7542.Id b_az9) -> a_az8 -> b_az9 [LclIdX, Arity=2, Str=DmdType, Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=2, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [20 0] 50 0}] T7542.unIdDot2 = \ (@ a_aKU) (@ b_aKV) (f_aze :: a_aKU -> T7542.Id b_aKV) (eta_B1 :: a_aKU) -> case f_aze of f_Xzl { __DEFAULT -> T7542.unId @ b_aKV (f_Xzl eta_B1) }
(This is already after InitialPhase [Gentle]
). One would expect MkId
and unId
to behave the same...
comment:15 Changed 4 years ago by
Tracing this back I found the Note [Dealing with bottom]
in CoreArity
, where the arity of case branches is used to calculate the arity of a case. Quote:
We ignore this "problem" (unless -fpedantic-bottoms is on), because being scrupulous would lose an important transformation for many programs. (See Trac #5587 for an example.)
So the next step would be answering: Can we distinguish between cases where this dodgy optimization is useful and required, and cases where it hurts?
comment:17 Changed 4 years ago by
Cc: | dfeuer added |
---|
comment:18 Changed 4 years ago by
Milestone: | 7.10.1 → 7.12.1 |
---|
Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.
comment:20 Changed 3 years ago by
Cc: | wren@… added |
---|
comment:21 Changed 2 years ago by
Milestone: | 8.0.1 |
---|
comment:22 Changed 21 months ago by
Cc: | lelf added |
---|
Can you give a concrete example? With this module
compiled with
ghc-7.6 -O -ddump-stg
I getwhich looks fine to me.