Opened 3 years ago

Last modified 2 months ago

#7206 new bug

Implement cheap build

Reported by: simonpj Owned by: simonpj
Priority: normal Milestone:
Component: Compiler Version: 7.4.2
Keywords: Cc: mail@…, carter.schonwald@…, snoyberg, edsko, ekmett
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

We sometimes see stuff like this:

f n ps = let ys = [1..x]
         in map (\zs. ys ++ zs) ps

You might think the (++) would fuse with the [1..x], via foldr/build fusion, but it doesn't. Why not? Because it would be WRONG to do so in this case:

f ns ps = let ys = map expensive ns
          in map (\zs. ys ++ zs) ps

If we fused the (++) with the map we might call expensive once for each element of ps.

This is fairly easy to fix. The point is that [1..x] is cheap; we'd prefer to fuse it even if doing so involves computing 1, 1+1, 2+1, etc multiple times. Suppose we express this fact thusly:

enumFromTo lo hi = cheapBuild (\cn. ....lo...hi...)
map f xs = build (\cn. ...f...xs...)

Now we want the foldr/cheapBuild rule to fire even if that would involve duplicating the call to cheapBuild. And we already have a way to do that: we make cheapBuild into a CONLIKE function.

Happily it's almost all simply a change to the libraries, not the compiler itself.

I just need to execute on this, but I keep failing to get round to it. Below is the beginning. One missing piece is that I need to replace the hack for build in the occurrence analyser, so that it works for cheapBuild too. (At least until we have Ilya's cardinality analyser.)

Simon

diff --git a/GHC/Base.lhs b/GHC/Base.lhs
index 6a36eb5..b78edf5 100644
--- a/GHC/Base.lhs
+++ b/GHC/Base.lhs
@@ -304,6 +304,12 @@ build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]

 build g = g (:) []

+cheapBuild   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
+{-# INLINE CONLIKE [1] cheapBuild #-}
+-- cheapBuild is just like build, except that it is CONLIKE
+-- See Note [cheapBuild]
+cheapBuild g = g (:) []
+
-- | A list producer that can be fused with 'foldr'.
-- This function is merely
--
@@ -320,6 +326,8 @@ augment g xs = g (:) xs
{-# RULES
"fold/build"    forall k z (g::forall b. (a->b->b) -> b -> b) . 
                 foldr k z (build g) = g k z
+"fold/cheapBuild"    forall k z (g::forall b. (a->b->b) -> b -> b) . 
+                     foldr k z (cheapBuild g) = g k z

 "foldr/augment" forall k z xs (g::forall b. (a->b->b) -> b -> b) . 
                 foldr k z (augment g xs) = g k (foldr k z xs)
@@ -343,6 +351,12 @@ augment g xs = g (:) xs
"augment/build" forall (g::forall b. (a->b->b) -> b -> b)
                        (h::forall b. (a->b->b) -> b -> b) .
                        augment g (build h) = build (\c n -> g c (h c n))
+
+"augment/cheapBuild" forall (g::forall b. (a->b->b) -> b -> b)
+                       (h::forall b. (a->b->b) -> b -> b) .
+                       augment g (cheapBuild h) = build (\c n -> g c (h c n))
+   -- 'augment' doesn't necessarily have a cheap argument, so we revert to 'build'
+
"augment/nil"   forall (g::forall b. (a->b->b) -> b -> b) .
                         augment g [] = build g
  #-}
@@ -351,6 +365,20 @@ augment g xs = g (:) xs
--      augment g (augment h t) = augment (\cn -> g c (h c n)) t
\end{code}

+Note [cheapBuild]
+~~~~~~~~~~~~~~~~~
+cheapBuild is just like build, except that it is CONLIKE
+
+It is used in situations where fusion is more imortant than sharing,
+ie in situation where its argument function 'g' in (cheapBuild g) is
+cheap.
+
+Main example: enumerations of one kind or another:
+    f x = let xs = [x..] 
+              go = \y. ....go y'....(map (h y) xs)...
+          in ...
+Here we woud like to fuse the map with the [x..]
+

 ----------------------------------------------
--              map     
@@ -831,7 +859,7 @@ a `iShiftRL#` b | b >=# WORD_SIZE_IN_BITS# = 0#

 -- Rules for C strings (the functions themselves are now in GHC.CString)
{-# RULES
-"unpack"       [~1] forall a   . unpackCString# a             = build (unpackFoldrCString# a)
+"unpack"       [~1] forall a   . unpackCString# a             = cheapBuild (unpackFoldrCString# a)
"unpack-list"  [1]  forall a   . unpackFoldrCString# a (:) [] = unpackCString# a
"unpack-append"     forall a n . unpackFoldrCString# a (:) n  = unpackAppendCString# a n

diff --git a/GHC/Enum.lhs b/GHC/Enum.lhs
index cea3ced..561a995 100644
--- a/GHC/Enum.lhs
+++ b/GHC/Enum.lhs
@@ -376,9 +376,9 @@ instance  Enum Char  where
     enumFromThenTo (C# x1) (C# x2) (C# y) = efdtChar (ord# x1) (ord# x2) (ord# y)

 {-# RULES
-"eftChar"       [~1] forall x y.        eftChar x y       = build (\c n -> eftCharFB c n x y)
-"efdChar"       [~1] forall x1 x2.      efdChar x1 x2     = build (\ c n -> efdCharFB c n x1 x2)
-"efdtChar"      [~1] forall x1 x2 l.    efdtChar x1 x2 l  = build (\ c n -> efdtCharFB c n x1 x2 l)
+"eftChar"       [~1] forall x y.        eftChar x y       = cheapBuild (\c n -> eftCharFB c n x y)
+"efdChar"       [~1] forall x1 x2.      efdChar x1 x2     = cheapBuild (\ c n -> efdCharFB c n x1 x2)
+"efdtChar"      [~1] forall x1 x2 l.    efdtChar x1 x2 l  = cheapBuild (\ c n -> efdtCharFB c n x1 x2 l)
"eftCharList"   [1]  eftCharFB  (:) [] = eftChar
"efdCharList"   [1]  efdCharFB  (:) [] = efdChar
"efdtCharList"  [1]  efdtCharFB (:) [] = efdtChar
@@ -510,7 +510,7 @@ instance  Enum Int  where
-- In particular, we have rules for deforestation

 {-# RULES
-"eftInt"        [~1] forall x y. eftInt x y = build (\ c n -> eftIntFB c n x y)
+"eftInt"        [~1] forall x y. eftInt x y = cheapBuild (\ c n -> eftIntFB c n x y)
"eftIntList"    [1] eftIntFB  (:) [] = eftInt
  #-}

@@ -539,7 +539,7 @@ eftIntFB c n x0 y | x0 ># y    = n

 {-# RULES
"efdtInt"       [~1] forall x1 x2 y.
-                     efdtInt x1 x2 y = build (\ c n -> efdtIntFB c n x1 x2 y)
+                     efdtInt x1 x2 y = cheapBuild (\ c n -> efdtIntFB c n x1 x2 y)
"efdtIntUpList" [1]  efdtIntFB (:) [] = efdtInt
  #-}

@@ -646,8 +646,8 @@ instance  Enum Integer  where
     enumFromThenTo x y lim = enumDeltaToInteger x (y-x) lim

 {-# RULES
-"enumDeltaInteger"      [~1] forall x y.  enumDeltaInteger x y     = build (\c _ -> enumDeltaIntegerFB c x y)
-"efdtInteger"           [~1] forall x y l.enumDeltaToInteger x y l = build (\c n -> enumDeltaToIntegerFB c n x y l)
+"enumDeltaInteger"      [~1] forall x y.  enumDeltaInteger x y     = cheapBuild (\c _ -> enumDeltaIntegerFB c x y)
+"efdtInteger"           [~1] forall x y l.enumDeltaToInteger x y l = cheapBuild (\c n -> enumDeltaToIntegerFB c n x y l)
"enumDeltaInteger"      [1] enumDeltaIntegerFB   (:)    = enumDeltaInteger
"enumDeltaToInteger"    [1] enumDeltaToIntegerFB (:) [] = enumDeltaToInteger
  #-}

Attachments (2)

cheap-build-ghc.patch (4.2 KB) - added by simonpj 2 years ago.
Patch for GHC itself
cheap-build-base.patch (4.9 KB) - added by simonpj 2 years ago.
Patch for base library

Download all attachments as: .zip

Change History (14)

comment:1 Changed 3 years ago by simonpj

See also #7309

comment:2 Changed 3 years ago by nomeata

  • Cc mail@… added

comment:3 Changed 3 years ago by carter

  • Cc carter.schonwald@… added

comment:4 Changed 3 years ago by igloo

  • Milestone set to 7.8.1
  • Owner set to simonpj

comment:5 Changed 2 years ago by simonpj

  • Milestone changed from 7.8.1 to _|_

I tried this again, but got disappointing results:

--------------------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
           anna          -0.4%     +0.0%     -6.3%     -5.9%     +0.0%
           ansi          -0.2%     -0.6%      0.00      0.00     +0.0%
           atom          -0.1%     +0.0%     +2.0%     +2.0%     +0.0%
         awards          -0.4%     +0.0%      0.00      0.00     +0.0%
         banner          -0.1%     +0.0%      0.00      0.00     +0.0%
     bernouilli          -0.0%     +0.0%     -1.6%     -1.6%     +0.0%
   binary-trees          -0.2%     +0.0%     +4.4%     +4.4%     +0.0%
          boyer          -0.2%     +0.0%      0.07      0.07     +0.0%
         boyer2          -0.5%     +0.0%      0.01      0.01     +0.0%
           bspt          -0.3%     +0.0%      0.02      0.02     +0.0%
      cacheprof          -1.4%     +0.2%     -0.7%     -0.9%     -0.5%
       calendar          -0.2%     +0.0%      0.00      0.00     +0.0%
       cichelli          -0.4%     -0.0%      0.13      0.13     +0.0%
        circsim          -0.2%     +0.0%     -4.6%     -4.7%     +0.0%
       clausify          -0.2%     +0.0%      0.06      0.06     +0.0%
  comp_lab_zift          -0.2%     +0.0%     -2.6%     -2.6%     +0.0%
       compress          -0.3%     +0.0%     +3.0%     +3.7%     +0.0%
      compress2          -0.1%     +0.0%     +1.3%     +1.3%     +0.0%
    constraints          -0.2%     +0.0%     -3.9%     -3.9%     +0.0%
   cryptarithm1          -0.2%     +0.0%     +6.3%     +6.3%     +0.0%
   cryptarithm2          -0.5%     +0.0%      0.02      0.02     +0.0%
            cse          -0.5%     +0.0%      0.00      0.00     +0.0%
          eliza          -0.1%     +0.0%      0.00      0.00     +0.0%
          event          -0.2%     +0.0%     +2.9%     +2.9%     +0.0%
         exp3_8          -0.2%     +0.0%     -2.8%     -2.8%     +0.0%
         expert          -0.1%     +0.0%      0.00      0.00     +0.0%
 fannkuch-redux          -0.1%     +0.0%     -2.4%     -2.4%     +0.0%
          fasta          -0.2%     +0.0%     -0.4%     -0.4%     +0.0%
            fem          -0.0%     +0.0%      0.04      0.04     +0.0%
            fft          -0.1%     +0.0%      0.06      0.06     +0.0%
           fft2          -0.1%     +0.0%      0.10      0.10     +0.0%
       fibheaps          -0.2%     +0.0%      0.05      0.05     +0.0%
           fish          -0.1%     +2.7%      0.03      0.03     +0.0%
          fluid          -0.1%     +0.2%      0.01      0.01   +100.0%
         fulsom          -0.3%     +0.0%     -4.0%     -4.0%     +3.2%
         gamteb          -0.1%     -0.0%      0.07      0.07     +0.0%
            gcd          -0.1%     +0.0%      0.05      0.05     +0.0%
    gen_regexps          -0.1%     +0.0%      0.00      0.00     +0.0%
         genfft          -0.1%     +0.0%      0.06      0.06     +0.0%
             gg          -0.5%     +0.1%      0.02      0.02     +0.0%
           grep          -0.1%     +0.0%      0.00      0.00     +0.0%
         hidden          -0.1%     +0.0%     +4.2%     +4.2%     +0.0%
            hpg          -0.4%     -0.2%     +8.8%     +8.8%     +0.0%
            ida          -0.2%     +0.0%      0.14      0.14     +0.0%
          infer          -0.3%     +0.0%      0.13      0.13     +0.0%
        integer          -0.1%     +0.0%     +0.1%     +0.1%     +0.0%
      integrate          -0.1%     +0.0%    +18.2%    +18.2%     +0.0%
   k-nucleotide          -0.1%     +0.0%     -6.2%     -6.2%     +0.0%
          kahan          -0.1%     +0.0%     +1.1%     +1.1%     +0.0%
        knights          -0.2%     +1.4%      0.01      0.01     +0.0%
           lcss          -0.2%     +0.0%     -3.5%     -3.5%     +0.0%
           life          -0.2%     +0.0%     +2.5%     +2.5%     +0.0%
           lift          -0.3%     +0.0%      0.00      0.00     +0.0%
      listcompr          -0.2%     +0.0%      0.15      0.15     +0.0%
       listcopy          -0.2%     +0.0%      0.16      0.16     +0.0%
       maillist          -0.3%     +0.0%      0.10      0.11     -3.9%
         mandel          -0.0%     -0.0%      0.12      0.12     +0.0%
        mandel2          -0.5%     +0.0%      0.00      0.00     +0.0%
        minimax          -0.3%     +0.0%      0.00      0.00     +0.0%
        mkhprog          -0.2%     +0.0%      0.00      0.00     +0.0%
     multiplier          -0.3%     +0.0%     +1.8%     +1.8%     +0.0%
         n-body          -0.2%     +0.0%     -0.7%     -0.6%     +0.0%
       nucleic2          -0.1%     +0.0%      0.11      0.11     +0.0%
           para          -0.0%     +0.0%     +1.4%     +0.8%     +0.0%
      paraffins          -0.2%     +0.0%    +13.6%    +13.6%     +0.0%
         parser          -1.3%     +3.5%      0.05      0.05     +0.0%
        parstof          -0.3%     +0.0%      0.01      0.01     +0.0%
            pic          -0.1%     +0.1%      0.00      0.00     +0.0%
       pidigits          -0.1%     +0.0%     +1.5%     +1.9%     +0.0%
          power          -0.1%     +0.0%     -4.9%     -4.9%     +0.0%
         pretty          -0.1%     -0.0%      0.00      0.00     +0.0%
         primes          -0.1%     +0.0%      0.11      0.11     +0.0%
      primetest          -0.0%     +0.0%      0.14      0.14     +0.0%
         prolog          -0.2%     +0.1%      0.00      0.17     +0.0%
         puzzle          -0.3%     +0.0%     +1.7%     +1.7%     +0.0%
         queens          -0.2%     +0.0%      0.02      0.02     +0.0%
        reptile          -0.2%     -0.0%      0.02      0.02     +0.0%
reverse-complem          -0.3%     +0.0%     +3.8%     +3.8%     +0.0%
        rewrite          -0.2%     +0.0%      0.02      0.02     +0.0%
           rfib          -0.1%     +0.0%      0.03      0.03     +0.0%
            rsa          -0.0%     +0.0%      0.03      0.12     +0.0%
            scc          -0.2%     +0.0%      0.00      0.00     +0.0%
          sched          -0.2%     +0.0%      0.03      0.03     +0.0%
            scs          -0.0%     +0.1%     -2.9%     -3.1%     +0.0%
         simple          -0.9%     +0.6%    -10.8%    -10.8%    +13.8%
          solid          -0.1%     +0.0%     +3.2%     +3.2%     +0.0%
        sorting          -0.1%     +0.0%      0.00      0.00     +0.0%
  spectral-norm          -0.2%     +0.0%     +0.1%     +0.1%     +0.0%
         sphere          -0.2%     +0.0%      0.08      0.08     +0.0%
         symalg          -0.2%     +0.0%      0.01      0.04     +0.0%
            tak          -0.2%     +0.0%      0.02      0.02     +0.0%
      transform          -0.2%     +0.0%     -0.6%     -0.6%     +0.0%
       treejoin          -0.2%     +0.0%     +0.0%     -0.7%     +0.0%
      typecheck          -0.2%     -0.0%     +0.0%     +0.0%     +0.0%
        veritas          +0.7%     +0.0%      0.00      0.00     +0.0%
           wang          -0.1%     +0.0%     -5.5%     -5.5%     +0.0%
      wave4main          -0.2%     +0.0%     -2.4%     -2.4%     +0.0%
   wheel-sieve1          -0.1%     +0.0%     -0.5%     -0.5%     +0.0%
   wheel-sieve2          -0.1%     +0.0%     -6.4%     -6.4%     +0.0%
           x2n1          -0.2%     +0.0%      0.00      0.00     +0.0%
--------------------------------------------------------------------------------
            Min          -1.4%     -0.6%    -10.8%    -10.8%     -3.9%
            Max          +0.7%     +3.5%    +18.2%    +18.2%   +100.0%
 Geometric Mean          -0.2%     +0.1%     +0.1%     +0.1%     +0.8%

On investigation it seemed that although it was good to fuse away the intermediate lists, we were duplicating the individual Ints or Chars that would otherwise have been shared, and this pushes up allocation. So we often lose a bit and seldom win much.

For example, in spectral/fish we get a lot more calls to GHC.CString.unpackAppendCString#, which allocates not only cons cells (which is fine) but also the characters themselves. Without the cheapBuild stuff we just call (++) on a shared list of characters.

So I'm parking this again. I attach the patches I used.

Simon

Changed 2 years ago by simonpj

Patch for GHC itself

Changed 2 years ago by simonpj

Patch for base library

comment:6 Changed 18 months ago by simonpj

If we do this, don't forget to use cheapBuild for repeat.

Simon

comment:7 Changed 8 months ago by dfeuer

Even more kinds of producers makes for even more fusion rules. If we decide to rewrite all consumers to foldr, I suppose that then opens up the possibility of having multiple producers instead, but expanding on both ends makes for a rule explosion. I'm also trying, on and off, to see if we can get away with writing all these enumerations using unfoldr or similar; I'm not sure how these ideas interact.

An idea that might be crazy: introduce a local annotation indicating that a certain function should be treated as CONLIKE at a particular call site. So then you'd have just {-# CHEAP #-} build blah and leave the fusion rules alone.

comment:8 Changed 8 months ago by simonpj

See also this thread, and this one.

Last edited 8 months ago by simonpj (previous) (diff)

comment:9 Changed 8 months ago by simonpj

Another thought. Regardless of what we do here, it seems likely that GHC will sometimes make the wrong choice. So we should give the programmer a way to make the "right" choice.

Example. Suppose [1..10000] is floated out and shared. Then we should provide duplicableEnumFromTo 1 10000 which means "please don't try to share me; instead fuse me with my consumers, even if that loses sharing". OK so you lose the nice notation, but you get to say what you want.

(And if we switch so that [1..1000] is by-default not-shared, then we should provide a way force it to be shared: nonDuplicableEnumFromTo 1 1000.

Simon

comment:10 Changed 8 months ago by snoyberg

  • Cc snoyberg added

Following up on a separate discussion on the haskell-cafe, I have a possibly related case:

main :: IO ()
main = printLen >> printLen

printLen :: IO ()
printLen = lengthM 0 [1..40000000 :: Int] >>= print

lengthM :: Monad m => Int -> [a] -> m Int
lengthM cnt [] = return cnt
lengthM cnt (_:xs) =
    cnt' `seq` lengthM cnt' xs
  where
    cnt' = cnt + 1

On my system, this takes almost 1.2GB of memory. If I comment out the second call to printLen, it takes 44KB.

comment:11 Changed 8 months ago by edsko

  • Cc edsko added

comment:12 Changed 2 months ago by ekmett

  • Cc ekmett added
Note: See TracTickets for help on using tickets.