Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#10528 closed bug (fixed)

compile time performance regression with OverloadedStrings and Text

Reported by: jakewheat Owned by:
Priority: high Milestone: 7.10.3
Component: Compiler Version: 7.10.2-rc2
Keywords: Cc: RyanGlScott, Roboguy, bgamari
Operating System: Linux Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by thomie)

There is a big performance regression in the compile time from ghc 7.10.1 to ghc 7.10.1.20150612

I believe it is in this file which has a huge literal value which also contains overloaded strings using Text:

https://github.com/JakeWheat/hssqlppp/blob/master/hssqlppp/src/Database/HsSqlPpp/Internals/Catalog/DefaultTemplate1Catalog.lhs

time cabal build with ghc 7.10.1

real	1m20.449s
user	2m5.040s
sys	0m48.504s

time cabal build with ghc 7.10.1.20150612

real	9m3.447s
user	12m19.704s
sys	3m25.724s

I am running debian 64 bit unstable with the ghc binary tarballs from here: https://www.haskell.org/ghc/

Change History (43)

comment:1 Changed 2 years ago by thomie

Description: modified (diff)

comment:2 Changed 2 years ago by thomie

Milestone: 7.10.3
Priority: normalhigh
Summary: compile time performance regression on big literalcompile time performance regression with OverloadedStrings and Text
Version: 7.10.2-rc2

I can reproduce this with the test below. Compiling with 7.10.2 takes more than 10x longer than with 7.10.1.

{-# LANGUAGE OverloadedStrings #-}

module T10528 where

import Data.Text (Text)

strings :: [Text]
strings = [
  "abstime", "aclitem", "bit", "bool", "box", "bpchar", "bytea", "char",
  "cid", "cidr", "circle", "date", "float4", "float8", "gtsvector", "inet",
  "interval", "json", "jsonb", "line", "lseg", "macaddr", "money", "name",
  "numeric", "oid", "oidvector", "path", "pg_lsn", "pg_node_tree", "point",
  "polygon", "refcursor", "regclass", "regconfig", "regdictionary", "regoper",
  "regoperator", "regproc", "regprocedure", "regtype", "reltime", "smgr",
  "text", "tid", "time", "timestamp", "timestamptz", "timetz", "tinterval",
  "tsquery", "tsvector", "txid_snapshot", "unknown", "uuid", "varbit",
  "varchar", "xid", "bit", "bool", "box", "bpchar", "bytea", "char", "cid",
  "cidr", "circle", "date", "float4", "float8", "gtsvector", "inet", "int2",
  "int2vector", "int4", "int8", "interval", "json", "jsonb", "line", "lseg",
  "macaddr", "money", "name", "numeric", "oid", "oidvector", "path", "pg_lsn",
  "pg_node_tree", "point", "polygon", "refcursor", "regclass", "regconfig",
  "regdictionary", "regoper", "regoperator", "regproc", "regprocedure",
  "regtype", "reltime", "smgr", "text", "tid", "time", "timestamp",
  "timestamptz", "timetz", "tinterval", "tsquery", "tsvector",
  "txid_snapshot", "unknown", "uuid", "varbit", "varchar", "xid", "xml"
  ]
$ cabal install text

$ ghc-7.10.1 T10528.hs -c -fforce-recomp -Rghc-timing -O
<<ghc: 186739648 bytes, 94 GCs, 4318989/12392440 avg/max bytes residency (7 samples), 28M in use, 0.001 INIT (0.004 elapsed), 0.332 MUT (0.387 elapsed), 0.328 GC (0.395 elapsed) :ghc>

$ ghc-7.10.2 T10528.hs -c -fforce-recomp -Rghc-timing -O
<<ghc: 2719125976 bytes, 440 GCs, 16808138/34792080 avg/max bytes residency (13 samples), 97M in use, 0.001 INIT (0.001 elapsed), 6.439 MUT (10.141 elapsed), 2.795 GC (4.449 elapsed) :ghc>>

comment:3 Changed 2 years ago by bgamari

Hmm, I had somehow missed this one. Let me have a look.

comment:4 Changed 2 years ago by bgamari

Owner: set to bgamari

comment:5 Changed 2 years ago by bgamari

The program gets substantially larger after the first simplifier pass in 7.10.2: while in 7.10.1 it always stays around 1000 terms or less, in 7.10.2 it grows to 40000 terms at its peak. The issue is reproducible on the master branch as well.

comment:6 Changed 2 years ago by bgamari

It appears that inlining is occurring where there previously was none. For instance, each top-level bindings under 7.10.1 produces (in -ddump-simpl),

T10528.strings181 :: Text
[GblId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
         WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 50 0}]
T10528.strings181 =
  text-1.2.1.1:Data.Text.Show.unpackCString# "abstime"#

Whereas in 7.10.2 we get this,

T10528.strings_dt62 :: [Char]
[GblId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
         WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 50 0}]
T10528.strings_dt62 = unpackCString# "abstime"#

Rec {
-- RHS size: {terms: 220, types: 125, coercions: 9}
T10528.strings476 [InlPrag=[0], Occ=LoopBreaker]
  :: forall s1_a3V7.
     Data.Text.Array.MArray s1_a3V7
     -> Int
     -> [Char]
     -> Int#
     -> State# s1_a3V7
     -> (# State# s1_a3V7, Text #)
[GblId, Arity=5, Str=DmdType <L,U(U)><L,U(U)><S,1*U><L,U><L,U>]
T10528.strings476 = ...
end Rec }


-- RHS size: {terms: 2, types: 0, coercions: 0}
T10528.strings121 :: Int
[GblId,
 Caf=NoCafRefs,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 20}]
T10528.strings121 = I# 4#

-- RHS size: {terms: 14, types: 15, coercions: 0}
T10528.strings475
  :: forall s1_a3V7. State# s1_a3V7 -> (# State# s1_a3V7, Text #)
[GblId,
 Arity=1,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 83 0}]
T10528.strings475 =
  \ (@ s1_a3V7) (s2_a3V8 [OS=OneShot] :: State# s1_a3V7) ->
    case newByteArray# @ s1_a3V7 8# s2_a3V8
    of _ [Occ=Dead] { (# ipv_a3Vu, ipv1_a3Vv #) ->
    T10528.strings476
      @ s1_a3V7
      (Data.Text.Array.MArray @ s1_a3V7 ipv1_a3Vv)
      T10528.strings121
      T10528.strings_dt62
      0#
      ipv_a3Vu
    }

-- RHS size: {terms: 2, types: 1, coercions: 0}
T10528.strings474 :: Text
[GblId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
         WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 20 0}]
T10528.strings474 = runSTRep @ Text T10528.strings475

comment:7 Changed 2 years ago by bgamari

In 7.10.1 and 7.10.2 the following simplifications occur (in chronological order),

  • First 7.10.1 and 7.10.2 both inline Data.Text.$fIsStringText and Data.String.fromString (something like once for each binding)
  • In the next simplifier run they both inline GHC.Base.build and GHC.Base.pack (again, something like once per binding)
  • In the next simplifier iteration 7.10.1 does no inlining. 7.10.2, however, keeps going, inlining Data.Text.Internal.Fusion.unstream, Data.Text.Internal.Fusion.Common.map, Data.Text.Internal.Fusion.Common.streamList, Data.Text.Internal.safe, and Data.Text.Internal.Fusion.Types.$WYield (again, proportional to the number of bindings).
Last edited 2 years ago by bgamari (previous) (diff)

comment:8 Changed 2 years ago by simonpj

Right, so after your second bullet (end of phase 1) we have something reasonable (this is for a list with two strings in it):

-- RHS size: {terms: 6, types: 1, coercions: 0}
a_s3zW :: Text
[LclId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
         WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 120 0}]
a_s3zW =
  Data.Text.Internal.Fusion.unstream
    (Data.Text.Internal.Fusion.Common.map
       Data.Text.Internal.safe
       (Data.Text.Internal.Fusion.Common.streamList
          @ Char (unpackCString# "abstime"#)))

-- RHS size: {terms: 6, types: 1, coercions: 0}
a_s3zY :: Text
[LclId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
         WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 120 0}]
a_s3zY =
  Data.Text.Internal.Fusion.unstream
    (Data.Text.Internal.Fusion.Common.map
       Data.Text.Internal.safe
       (Data.Text.Internal.Fusion.Common.streamList
          @ Char (unpackCString# "aclitem"#)))

-- RHS size: {terms: 3, types: 2, coercions: 0}
a_s3zX :: [Text]
[LclId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 30}]
a_s3zX = : @ Text a_s3zY ([] @ Text)

-- RHS size: {terms: 3, types: 1, coercions: 0}
strings :: [Text]
[LclIdX,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 30}]
strings = : @ Text a_s3zW a_s3zX

The unstream/map/safe/streamList is the result of inlining Data.Text.pack. I suppose it's ok to have that much for every literal string.

But then in the next pass (phase 0) we inline unstream, which is pretty big; and Data.Text.Internal.Fusion.Common.map and Data.Text.Internal.safe and Data.Text.Internal.Fusion.Common.streamList.

All of these are inlined because of an INLINE pragma.

So we are inlining boat-loads of code, under explicit user guidance, but for no purpose. I say this is the fault of the text library, not of GHC! What do you want to happen for literal strings?

I don't know what has changed. Has text changed? I think the same thing should happen with any version of GHC though I have not tried.

comment:9 Changed 2 years ago by rwbarton

There is supposed to be a "Rule fired: TEXT literal" but I see it only with 7.10.1, not my 7.10.1.20150719. Both using text-1.2.1.1.

comment:10 Changed 2 years ago by bgamari

Yep, I just came to this realization as well. The rule in question is here, https://github.com/bos/text/blob/master/Data/Text/Show.hs

Last edited 2 years ago by bgamari (previous) (diff)

comment:11 Changed 2 years ago by bgamari

In the simplifier pass right before the two compilers diverge the list of text values looks something like,

(c_d32a
   (Data.Text.pack
      (GHC.Base.build
         @ GHC.Types.Char
         (\ (@ b) ->
            GHC.CString.unpackFoldrCString#
              @ b "bpchar"#)))
   (c_d32a
      (Data.Text.pack
         (GHC.Base.build
            @ GHC.Types.Char
            (\ (@ b) ->
               GHC.CString.unpackFoldrCString#
                 @ b "bytea"#)))
...

Into this GHC.Base.build is inlined,

Considering inlining: build
  arg infos [ValueArg]
  interesting continuation RuleArgCtxt
  some_benefit True
  is exp: True
  is work-free: True
  guidance ALWAYS_IF(arity=1,unsat_ok=False,boring_ok=False)
  ANSWER = YES
Inlining done: GHC.Base.build
    Inlined fn:  \ (@ a)
                   (g [Occ=Once!] :: forall b. (a -> b -> b) -> b -> b) ->
                   g @ [a] (GHC.Types.: @ a) (GHC.Types.[] @ a)

and then pack,

Considering inlining: pack
  arg infos [NonTrivArg]
  interesting continuation BoringCtxt
  some_benefit True
  is exp: True
  is work-free: True
  guidance ALWAYS_IF(arity=0,unsat_ok=False,boring_ok=False)
  ANSWER = YES
Inlining done: Data.Text.pack
    Inlined fn:  \ (x [Occ=Once] :: GHC.Base.String) ->
                   Data.Text.Internal.Fusion.unstream
                     (Data.Text.Internal.Fusion.Common.map
                        Data.Text.Internal.safe
                        (Data.Text.Internal.Fusion.Common.streamList @ GHC.Types.Char x))

Finally the unpack-list rule will rewrite unpackFoldrCString# to unpackCString# which I believe is then supposed to cause this rule in Data.Text.Show to fire,

{-# RULES "TEXT literal" forall a.
    unstream (S.map safe (S.streamList (GHC.unpackCString# a)))
      = unpackCString# a #-}

Yet this appears not to happen in 7.10.2.

Last edited 2 years ago by bgamari (previous) (diff)

comment:12 Changed 2 years ago by bgamari

Thanks to the wonders of git bisect run I was able to determine the first bad commit (on the ghc-7.10 branch) to be 8af219adb914b292d0f8c737fe0a1e3f7fb19cf3, "Fix a huge space leak in the mighty Simplifier".

comment:13 Changed 2 years ago by simonpj

Aha! RULE TEXT literal is terribly fragile:

  • In GHC.Base we have
    {-# RULES
    "unpack"       [~1] forall a   . unpackCString# a             = build (unpackFoldrCString# a)
    "unpack-list"  [1]  forall a   . unpackFoldrCString# a (:) [] = unpackCString# a
    
  • build inlines in phase 1

So, transformation goes like this:

Start with
   fromString (unpackCString# "blah"#)

---> (simplify: gentle phase)  fire rule "unpack"

   fromString (build (unpackFoldrCString# "blah"#))

---> (simplify: phase2)  inline fromString

   Data.Text.pack (build (unpackFoldrCString# "blah"#))

---> (simplify: phase1)  inline pack, build, fire rule "unpack-list"

   unstream (map safe (streamList (unpackCString# "blah"#)))

Now you'd think that rule "TEXT literal" would now fire. But its LHS too has been rewritten by RULE "unpack" to

"TEXT literal" [ALWAYS] forall a :: Addr#
  unstream (map
              safe
              (streamList
                 @ Char
                 (build @ Char (\ @ b -> unpackFoldrCString# @ b a))))
  = unpackCString# a

You can see this by doing ghc --show-iface Data/Text/Show.hi, incidentally.

Why did that happen? Arguably it's a bug: we should not rewrite the LHS of a rule with other rules, any more than we should inline functions on the LHS of a rule.

But it betrays a potential flaw in the rule setup. Consider a source-program expression (unnstream (map safe (streamList (unpackCString# "foo")))). Since there is a rule for unpackCString#, there is no guarantee that rule TEXT literal will fire before rule unpack. In this case we know that the unpackCString# call will eventually be rewritten back into unpackCString#. But it would be better to express that directly by saying

{-# RULES [2] "TEXT literal" forall a.
    unstream (S.map safe (S.streamList (GHC.unpackCString# a)))
      = unpackCString# a #-}

Now the two rules do not compete, and all will be good.

So there are several things to say here

  • GHC probably should do no inlining and no rule rewriting on LHS of rules. I'll fix that.
  • GHC already warns when an inlining competes with a rule; but it should also warn if a rule competes with a rule. I'll fix that too.
  • To eliminate the warning, rule "TEXT literal" should be fixed as above.

comment:14 Changed 2 years ago by bgamari

Ahh, I should have thought to check the interface file! That being said, I never would have expected the rewrite of the LHS.

I'll open a pull request against text fixing the rule.

comment:15 Changed 2 years ago by Simon Peyton Jones <simonpj@…>

In bc4b64ca/ghc:

Do not inline or apply rules on LHS of rules

This is the right thing to do anyway, and fixes Trac #10528

comment:16 Changed 2 years ago by Simon Peyton Jones <simonpj@…>

In 2d88a531/ghc:

Improve warnings for rules that might not fire

Two main things here

* Previously we only warned about the "head" function of the rule,
  but actually the warning applies to any free variable on the LHS.

* We now warn not only when one of these free vars can inline, but
  also if it has an active RULE (c.f. Trac #10528)

See Note [Rules and inlining/other rules] in Desugar

This actually shows up quite a few warnings in the libraries, notably
in Control.Arrow, where it correctly points out that rules like
    "compose/arr"   forall f g .
                    (arr f) . (arr g) = arr (f . g)
might never fire, because the rule for 'arr' (dictionary selection)
might fire first.  I'm not really sure what to do here; there is some
discussion in Trac #10595.

A minor change is adding BasicTypes.pprRuleName to pretty-print RuleName.

comment:17 Changed 2 years ago by simonpj

The commit in comment:15 fixes this regression, regardless of the change to the "TEXT literal" rule

I'll leave this open because

  • We might want to merge to 7.10.3
  • There's still an open question about rules for class methods (see comment:16)

Simon

comment:18 Changed 2 years ago by bgamari

Status: newmerge

text-1.2.1.2 includes phase control specifications on the literal rewrite rules which should prevent the rule competition described by simonpj in comment:13.

I'll mark this as status merge in case we end up doing a 7.10.3 release. This should now be fixed in master. See #10595 for the remaining issues of rewrite rule ordering on class functions.

Last edited 2 years ago by bgamari (previous) (diff)

comment:19 Changed 2 years ago by jakewheat

I tried with text-1.2.1.2:

with ghc 7.10.1:

real	1m20.054s
user	2m3.324s
sys	0m51.684s

with ghc 7.10.2:

real	10m32.169s
user	13m39.756s
sys	4m18.856s

It is still slow - is this expected?

comment:20 Changed 2 years ago by bgamari

That is most certainly not expected, contrary to my previous measurements, and very concerning. I will need to investigate. Thanks for mentioning this.

comment:21 Changed 2 years ago by jakewheat

I wasn't able to see the speed up with the reduced test case code above either:

ghc 7.10.2, text 1.2.1.1

time ghc Test.hs -c -fforce-recomp -Rghc-timing -O -package-db .cabal-sandbox/x86_64-linux-ghc-7.10.2-packages.conf.d/
<<ghc: 2715366976 bytes, 544 GCs, 17352140/46388280 avg/max bytes residency (13 samples), 111M in use, 0.000 INIT (0.001 elapsed), 2.000 MUT (1.892 elapsed), 0.844 GC (1.033 elapsed) :ghc>>

real	0m2.958s
user	0m2.908s
sys	0m0.048s

ghc 7.10.2, text 1.2.1.2

time ghc Test.hs -c -fforce-recomp -Rghc-timing -O -package-db .cabal-sandbox/x86_64-linux-ghc-7.10.2-packages.conf.d/
<<ghc: 2725083744 bytes, 576 GCs, 16625100/31660992 avg/max bytes residency (14 samples), 87M in use, 0.000 INIT (0.001 elapsed), 1.948 MUT (1.885 elapsed), 0.876 GC (1.034 elapsed) :ghc>>

real	0m2.954s
user	0m2.892s
sys	0m0.060s

ghc 7.10.1, text 1.2.1.2

time ghc Test.hs -c -fforce-recomp -Rghc-timing -O -package-db .cabal-sandbox/x86_64-linux-ghc-7.10.1-packages.conf.d/
<<ghc: 184198016 bytes, 98 GCs, 5557874/16933408 avg/max bytes residency (7 samples), 36M in use, 0.000 INIT (0.001 elapsed), 0.140 MUT (0.120 elapsed), 0.112 GC (0.138 elapsed) :ghc>>

real	0m0.289s
user	0m0.280s
sys	0m0.004s

comment:22 Changed 2 years ago by bgamari

Owner: bgamari deleted
Status: mergenew

Unfortunately for reasons I don't yet understand, my testcase was not reliably reproducing the issue, which led me to believe that the phase control annotations resolved the issue. Sadly, it seems that this is not the case.

Simon, I am a bit concerned that the phase control annotations aren't actually sufficient to resolve this issue. The problem being that the rules are being rewritten during the compilation of the module which defines them (Data.Text.Show); consequently the rules making it in to the interface file are still being rewritten to,

"TEXT literal" [2] forall a :: Addr#
  unstream (map
              safe
              (streamList
                 @ Char
                 (build @ Char (\ @ b -> unpackFoldrCString# @ b a))))
  = unpackCString# a

Perhaps I misunderstood: did you not expect this issue to be fixed via phase control?

It seems that the only solution here is to fix the underlying issue: ensure that the LHS is not rewritten by merging the fix from comment:15.

comment:23 Changed 2 years ago by bgamari

Arg, it appears that my validation script was running against the wrong GHC tree when I went to validate the text change, hence the incorrect conclusion that it resolved the issue. A very unfortunate mistake.

comment:24 Changed 2 years ago by simonpj

When rewriting in a rule (or unfolding), GHC sets the phase to the activation phase of the rule (or unfolding); in this case [2]. Now, from comment:13, it was clear that I believed that rule unpack was switched off before (the new) TEXT literal was switched on. But I was wrong: unpack is active anytime before phase 1, and hence is active in phase 2; so the two still compete.

I should have said

{-# RULES [1] "TEXT literal" forall a.
    unstream (S.map safe (S.streamList (GHC.unpackCString# a)))
      = unpackCString# a #-}

Now TEXT literal won't be active until phase 1, by which time unpack is switched off.

Try that. How annoying.

comment:25 Changed 2 years ago by simonpj

When rewriting in a rule (or unfolding), GHC sets the phase to the activation phase of the rule (or unfolding);"

what does "the phase" refer to?

The simplifier has an ambient phase. We just set the ambient phase before rewriting in unfoldings and rewrite rules. See Note [Simplifying inside stable unfoldings] in SimplUtils

comment:26 Changed 2 years ago by bgamari

I see, while rewriting inside a rule or unfolding the simplifier is essentially pretending it is in the phase specified by the activation phase of the thing being rewritten.

comment:27 Changed 2 years ago by simonpj

Yes!

comment:28 Changed 2 years ago by bgamari

Resolution: fixed
Status: newclosed

This should now actually be resolved with text 1.2.1.3.

comment:29 Changed 2 years ago by thomie

Resolution: fixed
Status: closednew

comment:30 in reply to:  17 Changed 2 years ago by thomie

Status: newmerge

Reopening because the commit from comment:15 hasn't been merged yet.

The commit in comment:15 fixes this regression, regardless of the change to the "TEXT literal" rule

I'll leave this open because

  • We might want to merge to 7.10.3
  • There's still an open question about rules for class methods (see comment:16)

Simon

comment:31 Changed 2 years ago by simonpj

I think the commit in comment:15 is optional. It's probably a good idea anyway, but any library that needs it probably needs more phase control on its rules!

comment:32 Changed 2 years ago by afarmer

Please do merge the commit referenced in comment:15!

HERMIT (for better or worse) still relies on rules pragmas to specify equational properties. The "fix huge space leak" commit mentioned in comment:12 effectively prevents us from using HERMIT with 7.10.2 because the left-hand sides of rules are being rewritten before they get to HERMIT. (Apologies for not spotting this earlier... I've been traveling this summer.)

comment:33 Changed 2 years ago by RyanGlScott

Cc: RyanGlScott added

comment:34 Changed 2 years ago by afarmer

I cherry-picked bc4b64ca5b99bff6b3d5051b57cb2bc52bd4c841 onto the 7.10.2 release branch and built it to test HERMIT. It does fix the problem we were having with the LHS of rules being altered by the simplifier, which is good. However, the inlining/rule application still appears to be happening in the RHS of the rules, which wasn't expected. Looking at the patch, I see the simplifier environment for the RHS still allows inlining/rule application.

Can we change the simplification of the RHS to also not inline/apply rules? (I'm happy to submit a patch that does this.)

I assume doing so in the RHS saves redundant simplifier work, but (even outside HERMIT's use) the unexpected RHSs make it difficult to glue together rules into rewrite systems. For instance, given the following dummy rule set:

"foo" forall x. foo x = bar (baz x)
"baz-quux" forall y. baz (quux y) = norf y

If this expression appeared:

foo (quux z)

I would expect it to be rewritten to:

bar (baz (quux z))

then:

bar (norf z)

But, if baz is inlined in the RHS of rule "foo", then rule "baz-quux" may never get to fire on the result of "foo" applications. I realize I could be careful about assigning phases to the inline pragma on `baz' and both rules, but that seems extremely fiddly compared to just not inlining/rewriting the RHS to begin with.

Version 1, edited 2 years ago by afarmer (previous) (next) (diff)

comment:35 in reply to:  34 ; Changed 2 years ago by simonpj

Replying to afarmer:

I realize I could be careful about assigning phases to the inline pragma on baz and both rules, but that seems extremely fiddly compared to just not inlining/rewriting the RHS to begin with.

It may be fiddly but you absolutely must do it. In the rewrite sequence you give, the inlining for baz is active, so GHC could perfectly well rewrite

bar (baz (quux z))  --->   bar (<baz's inlining with (quux z) substituted>)

What makes you think that rule "baz-quux" will get there first? Even if it usually does, some trivial thing might mean the rule missed on the first go, and then you'd inline baz. We initially tried to guarantee that if rules and inlinings were both active, the rules would "win" but it was flaky and unreliable. Hence phase control.

If you want robust, predictable behaviour, you should use phases. Sorry! (And then there's no reason not to inline/apply rules in the RHS.)

comment:36 Changed 2 years ago by simonpj

Status is 'merge' because we should merge the patch as described in comment:34 into 7.10.3, if/when we release it.

comment:37 in reply to:  35 Changed 2 years ago by afarmer

Ah, so much for my attempt to make my request seem like a non-HERMIT issue. :-P

We would still like to have access to the original version of the rule (without inlining/rule application being performed on either side) for HERMIT's purposes. Is that possible? The rewriting of the RHS seems to happen even if HERMIT is the very first phase of the core2core pipeline, so I take it that it happens during desugaring? Might it be possible to delay until the first simplifier pass runs?

To be concrete, I was proposing the following patch:

diff --git a/compiler/simplCore/Simplify.hs b/compiler/simplCore/Simplify.hs
index d816d3f..f9e3f92 100644
--- a/compiler/simplCore/Simplify.hs
+++ b/compiler/simplCore/Simplify.hs
@@ -2969,10 +2969,9 @@ simplRules env mb_new_nm rules
                           , ru_fn = fn_name, ru_rhs = rhs
                           , ru_act = act })
       = do { (env, bndrs') <- simplBinders env bndrs
-           ; let lhs_env = updMode updModeForRuleLHS env
-                 rhs_env = updMode (updModeForStableUnfoldings act) env
-           ; args' <- mapM (simplExpr lhs_env) args
-           ; rhs'  <- simplExpr rhs_env rhs
+           ; let env' = updMode updModeForRuleLHS env
+           ; args' <- mapM (simplExpr env') args
+           ; rhs'  <- simplExpr env' rhs
            ; return (rule { ru_bndrs = bndrs'
                           , ru_fn    = mb_new_nm `orElse` fn_name
                           , ru_args  = args'

which gets our testsuite passing again.

I realize HERMIT support is probably not on GHC's critical path ;-) ... I'm just trying to find a work around so we can support GHC 7.10. Thanks for your help!

comment:38 Changed 2 years ago by afarmer

As an example of what is happening, we are seeing rules like this:

repH :: [a] -> [a] -> [a]

{-# RULES "repH ++" [~] forall xs ys. repH (xs ++ ys) = repH xs . repH ys #-}

get to HERMIT as:

"repH ++" forall xs ys. repH (xs ++ ys) = let f = repH xs
                                              g = repH ys 
                                          in \ z -> f (g z)

In this case it is just an unfolding of composition, but some rules get rather gross on the RHS. The extra junk makes equational reasoning with these rules very fiddly and sort of breaks the correspondence with the source-level code.

Some possible solutions that occur to me:

  1. If we could get access to the rules before this inlining/rule application in the RHS occurs, then we wouldn't care what happens to them later. So just delaying things a bit.
  1. If we don't inline/apply rules in the RHS at all (as above).
  1. If we don't inline/apply rules in the RHS of rules marked with the NeverActive notation (rules intended for HERMIT use are generally NeverActive).
  1. We implement our own concrete syntax for HERMIT rules and stop abusing RULES pragmas. I realize this is probably the "best" solution, but it's also an API-breaker that I'm guessing would land in 7.12 at the earliest.

Would any of those be acceptable/possible?

Last edited 2 years ago by afarmer (previous) (diff)

comment:39 Changed 2 years ago by afarmer

Proposed patch that implements Option 3 in my list above... which I think is the best short term solution.

diff --git a/compiler/simplCore/Simplify.hs b/compiler/simplCore/Simplify.hs
index d816d3f..0e4ca50 100644
--- a/compiler/simplCore/Simplify.hs
+++ b/compiler/simplCore/Simplify.hs
@@ -38,7 +38,7 @@ import CoreArity
 --import PrimOp           ( tagToEnumKey ) -- temporalily commented out. See #8326
 import Rules            ( mkSpecInfo, lookupRule, getRules )
 import TysPrim          ( voidPrimTy ) --, intPrimTy ) -- temporalily commented out. See #8326
-import BasicTypes       ( TopLevelFlag(..), isTopLevel, RecFlag(..) )
+import BasicTypes       ( TopLevelFlag(..), isTopLevel, RecFlag(..), Activation(..) )
 import MonadUtils       ( foldlM, mapAccumLM, liftIO )
 import Maybes           ( orElse )
 --import Unique           ( hasKey ) -- temporalily commented out. See #8326
@@ -2970,7 +2970,9 @@ simplRules env mb_new_nm rules
                           , ru_act = act })
       = do { (env, bndrs') <- simplBinders env bndrs
            ; let lhs_env = updMode updModeForRuleLHS env
-                 rhs_env = updMode (updModeForStableUnfoldings act) env
+                 rhs_env = case act of
+                            NeverActive -> lhs_env
+                            _ -> updMode (updModeForStableUnfoldings act) env
            ; args' <- mapM (simplExpr lhs_env) args
            ; rhs'  <- simplExpr rhs_env rhs
            ; return (rule { ru_bndrs = bndrs'

Since NeverActive rules are not actually applied by GHC, not rewriting their RHSs shouldn't change anything for real libraries, while still offering HERMIT access to the original rule without the extra inlining/rule application. This patch also allows our test suite to pass.

comment:40 Changed 2 years ago by simonpj

Can you make a new ticket, about this issue (it's nothing to do with #10528), and explain there why you want the change.

I see no compelling reason why we can't switch off rules and inlinings in the RHS just as we do in the LHS. We'd just need to carefully document the reason for doing so, and having a ticket to refer to is good.

I don't really want to mess with 7.10 on this.... I don't think there would be knock on effects, but I would bet my life on it

comment:41 Changed 2 years ago by Roboguy

Cc: Roboguy added

comment:42 Changed 2 years ago by bgamari

Cc: bgamari added

comment:43 Changed 2 years ago by bgamari

Resolution: fixed
Status: mergeclosed

"Do not inline or apply rules on LHS of rules" has been merged to ghc-7.10. This closes this issue.

Rewriting of rule RHSs is being tracked in #10829.

Last edited 2 years ago by bgamari (previous) (diff)
Note: See TracTickets for help on using tickets.