Opened 10 years ago

Closed 5 years ago

#1904 closed bug (fixed)

Strictness analyser should be smarter

Reported by: igloo Owned by:
Priority: normal Milestone:
Component: Compiler Version: 6.8.1
Keywords: Cc: merehap@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:


This module:

module Q (foo) where
foo :: String -> [String] -> Bool
#ifdef FIRST
foo x _ | x `seq` x == "." = True
foo x _ |         x == "." = True
foo x xs = x `seq` any (x ==) xs

should be compiled to the same code regardless of whether or not FIRST is defined. However, this is not the case; there is an extra case when FIRST is not defined:

$ ghc --version                                              
The Glorious Glasgow Haskell Compilation System, version 6.8.1
$ ghc -fforce-recomp -ddump-simpl -cpp -O -c Q.hs -DFIRST > 1
$ ghc -fforce-recomp -ddump-simpl -cpp -O -c Q.hs > 2
$ diff -U 1000 1 2
--- 1   2007-11-17 21:02:37.000000000 +0000
+++ 2   2007-11-17 21:02:39.000000000 +0000
@@ -1,33 +1,35 @@
 ==================== Tidy Core ====================
 Q.a :: GHC.Base.Char
  Str: DmdType m]
 Q.a = GHC.Base.C# '.'
 Q.lvl :: [GHC.Base.Char]
  Str: DmdType]
 Q.lvl = GHC.Base.: @ GHC.Base.Char Q.a (GHC.Base.[] @ GHC.Base.Char) :: GHC.Base.String -> [GHC.Base.String] -> GHC.Base.Bool
 [Arity 2
  Str: DmdType SL] =
-  \ (x_a5D :: GHC.Base.String) (ds_d6y :: [GHC.Base.String]) ->
-    case GHC.Base.eqString x_a5D Q.lvl of wild_Xc {
+  \ (x_a5D :: GHC.Base.String) (ds_d6w :: [GHC.Base.String]) ->
+    case GHC.Base.eqString x_a5D Q.lvl of wild_B1 {
       GHC.Base.False ->
-       GHC.List.any @ GHC.Base.String (GHC.Base.eqString x_a5D) ds_d6y;
+       case x_a5D of tpl_Xd { __DEFAULT ->
+       GHC.List.any @ GHC.Base.String (GHC.Base.eqString tpl_Xd) ds_d6w
+       };
       GHC.Base.True -> GHC.Base.True
 ==================== Tidy Core Rules ====================

There is a little discussion in the thread where this was first reported:

Change History (6)

comment:1 Changed 10 years ago by simonpj

Summary: strictness analyser should be smarterStrictness analyser should be smarter

Here is my analysis copied from the thread (initially I missed the link, and reconstructed them in my head, so I thought it'd be worth dumping them here).


When GHC decides that 'foo' is strict, it does not require that every caller must evaluate the argument before the call. So GHC does not assume that foo always gets an evaluated argument. Indeed, if GHC sees the call (foo v), where foo is strict, it does not evaluate v before the call. Nor in the case of (map foo xs).

When GHC sees

	case x of
	    pat -> .....(case x of DEFAULT -> e)...

it removes the (case x of DEFAULT ->), because it can "see" that x is already evaluated. [A 'seq' turns into one of those (case x of DEFAULT) things.] This transformation is done by the simplifier, which keeps an environment mapping variables to information about their degree of evaluation.

However, if the outer case was

	case (f x) of
	   pat  -> ...

then GHC does not record anything about x, even if f is strict. I guess it could, by making use of f's strictness information.

But then what about

	case (f (g x)) of ...

where both f and g are strict? Etc.

One could imagine either

  • Beefing up the existing simplifier optimisation in a relatively ad hoc way.

Advantage: applies even if the (case x of DEFAULT) thing shows up much later

Disadvantage: probably a bit ad hoc, and the simplifier is already v complicated

  • Adding a new top-down sweep to the strictness analyser. Actually it could be done separately from strictness analysis, but would use much of the same code (e.g. "what demand is placed on x by evaluating (f (g x))?").

Advantage: doesn't complicated the already-complex simplifier. The new pass ("case optimisation", perhaps) could remove some stuff from the too-complex simplifier

Disadvantage: a new pass

I'm not terribly inclined to do this myself, unless someone convinces me that there are big gains to be had, but I'd be happy to support someone who already knows GHC a bit (Ian?) if you want to have a go.

comment:2 Changed 9 years ago by simonmar

Architecture: UnknownUnknown/Multiple

comment:3 Changed 9 years ago by simonmar

Operating System: UnknownUnknown/Multiple

comment:4 Changed 7 years ago by merehap

Cc: merehap@… added
Type of failure: None/Unknown

comment:5 Changed 5 years ago by morabbin

Does the second approach above amount to:

When we first encounter a case, calculate the demand placed on vars in the scrutinee, when evaluating the scrutinee. Use this information when analyzing the branches.

comment:6 Changed 5 years ago by simonpj

Resolution: fixed
Status: newclosed

The code now looks good. Here is Case A, with -ddump-simpl -O output:

foo :: String -> [String] -> Bool
foo x _ | x `seq` x == "." = True
foo x xs                   = x `seq` any (x ==) xs

===== =
  \ (x_aeG :: GHC.Base.String) (ds_dl8 :: [GHC.Base.String]) ->
    case x_aeG of x1_XeK { __DEFAULT ->
    case GHC.Base.eqString x1_XeK T1904.foo1 of _ {
      GHC.Types.False ->
        GHC.List.any @ GHC.Base.String (GHC.Base.eqString x1_XeK) ds_dl8;
      GHC.Types.True -> GHC.Types.True
    } }

Now without the first seq, Case B:

foo :: String -> [String] -> Bool
foo x _ | x == "." = True
foo x xs           = x `seq` any (x ==) xs

==== =
  \ (x_aeG :: GHC.Base.String) (ds_dl7 :: [GHC.Base.String]) ->
    case GHC.Base.eqString x_aeG T1904.foo1 of _ {
      GHC.Types.False ->
        case x_aeG of x1_akh { __DEFAULT ->
        GHC.List.any @ GHC.Base.String (GHC.Base.eqString x1_akh) ds_dl7
      GHC.Types.True -> GHC.Types.True

Notice that both have one remaining seq on x, but in one case it's before the call to eqString, and in the other case after it, which is as it should be.

So I'm happy here, and am closing. Re-open if you disagree.

Note: See TracTickets for help on using tickets.