GHC: Ticket #7206: Implement cheap build
http://ghc.haskell.org/trac/ghc/ticket/7206
<p>
We sometimes see stuff like this:
</p>
<pre class="wiki">f n ps = let ys = [1..x]
in map (\zs. ys ++ zs) ps
</pre><p>
You might think the <tt>(++)</tt> would fuse with the <tt>[1..x]</tt>, via foldr/build fusion, but it doesn't. Why not? Because it would be WRONG to do so in this case:
</p>
<pre class="wiki">f ns ps = let ys = map expensive ns
in map (\zs. ys ++ zs) ps
</pre><p>
If we fused the <tt>(++)</tt> with the <tt>map</tt> we might call <tt>expensive</tt> once for each element of <tt>ps</tt>.
</p>
<p>
This is fairly easy to fix. The point is that <tt>[1..x]</tt> is cheap; we'd prefer to fuse it even if doing so involves computing 1, 1+1, 2+1, etc multiple times. Suppose we express this fact thusly:
</p>
<pre class="wiki">enumFromTo lo hi = cheapBuild (\cn. ....lo...hi...)
map f xs = build (\cn. ...f...xs...)
</pre><p>
Now we want the <tt>foldr/cheapBuild</tt> rule to fire even if that would involve duplicating the call to <tt>cheapBuild</tt>. And we already have a way to do that: we make <tt>cheapBuild</tt> into a <tt>CONLIKE</tt> function.
</p>
<p>
Happily it's almost all simply a change to the libraries, not the compiler itself.
</p>
<p>
I just need to execute on this, but I keep failing to get round to it. Below is the beginning. One missing piece is that I need to replace the hack for <tt>build</tt> in the occurrence analyser, so that it works for <tt>cheapBuild</tt> too. (At least until we have Ilya's cardinality analyser.)
</p>
<p>
Simon
</p>
<pre class="wiki">diff --git a/GHC/Base.lhs b/GHC/Base.lhs
index 6a36eb5..b78edf5 100644
--- a/GHC/Base.lhs
+++ b/GHC/Base.lhs
@@ -304,6 +304,12 @@ build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []
+cheapBuild :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
+{-# INLINE CONLIKE [1] cheapBuild #-}
+-- cheapBuild is just like build, except that it is CONLIKE
+-- See Note [cheapBuild]
+cheapBuild g = g (:) []
+
-- | A list producer that can be fused with 'foldr'.
-- This function is merely
--
@@ -320,6 +326,8 @@ augment g xs = g (:) xs
{-# RULES
"fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
foldr k z (build g) = g k z
+"fold/cheapBuild" forall k z (g::forall b. (a->b->b) -> b -> b) .
+ foldr k z (cheapBuild g) = g k z
"foldr/augment" forall k z xs (g::forall b. (a->b->b) -> b -> b) .
foldr k z (augment g xs) = g k (foldr k z xs)
@@ -343,6 +351,12 @@ augment g xs = g (:) xs
"augment/build" forall (g::forall b. (a->b->b) -> b -> b)
(h::forall b. (a->b->b) -> b -> b) .
augment g (build h) = build (\c n -> g c (h c n))
+
+"augment/cheapBuild" forall (g::forall b. (a->b->b) -> b -> b)
+ (h::forall b. (a->b->b) -> b -> b) .
+ augment g (cheapBuild h) = build (\c n -> g c (h c n))
+ -- 'augment' doesn't necessarily have a cheap argument, so we revert to 'build'
+
"augment/nil" forall (g::forall b. (a->b->b) -> b -> b) .
augment g [] = build g
#-}
@@ -351,6 +365,20 @@ augment g xs = g (:) xs
-- augment g (augment h t) = augment (\cn -> g c (h c n)) t
\end{code}
+Note [cheapBuild]
+~~~~~~~~~~~~~~~~~
+cheapBuild is just like build, except that it is CONLIKE
+
+It is used in situations where fusion is more imortant than sharing,
+ie in situation where its argument function 'g' in (cheapBuild g) is
+cheap.
+
+Main example: enumerations of one kind or another:
+ f x = let xs = [x..]
+ go = \y. ....go y'....(map (h y) xs)...
+ in ...
+Here we woud like to fuse the map with the [x..]
+
----------------------------------------------
-- map
@@ -831,7 +859,7 @@ a `iShiftRL#` b | b >=# WORD_SIZE_IN_BITS# = 0#
-- Rules for C strings (the functions themselves are now in GHC.CString)
{-# RULES
-"unpack" [~1] forall a . unpackCString# a = build (unpackFoldrCString# a)
+"unpack" [~1] forall a . unpackCString# a = cheapBuild (unpackFoldrCString# a)
"unpack-list" [1] forall a . unpackFoldrCString# a (:) [] = unpackCString# a
"unpack-append" forall a n . unpackFoldrCString# a (:) n = unpackAppendCString# a n
diff --git a/GHC/Enum.lhs b/GHC/Enum.lhs
index cea3ced..561a995 100644
--- a/GHC/Enum.lhs
+++ b/GHC/Enum.lhs
@@ -376,9 +376,9 @@ instance Enum Char where
enumFromThenTo (C# x1) (C# x2) (C# y) = efdtChar (ord# x1) (ord# x2) (ord# y)
{-# RULES
-"eftChar" [~1] forall x y. eftChar x y = build (\c n -> eftCharFB c n x y)
-"efdChar" [~1] forall x1 x2. efdChar x1 x2 = build (\ c n -> efdCharFB c n x1 x2)
-"efdtChar" [~1] forall x1 x2 l. efdtChar x1 x2 l = build (\ c n -> efdtCharFB c n x1 x2 l)
+"eftChar" [~1] forall x y. eftChar x y = cheapBuild (\c n -> eftCharFB c n x y)
+"efdChar" [~1] forall x1 x2. efdChar x1 x2 = cheapBuild (\ c n -> efdCharFB c n x1 x2)
+"efdtChar" [~1] forall x1 x2 l. efdtChar x1 x2 l = cheapBuild (\ c n -> efdtCharFB c n x1 x2 l)
"eftCharList" [1] eftCharFB (:) [] = eftChar
"efdCharList" [1] efdCharFB (:) [] = efdChar
"efdtCharList" [1] efdtCharFB (:) [] = efdtChar
@@ -510,7 +510,7 @@ instance Enum Int where
-- In particular, we have rules for deforestation
{-# RULES
-"eftInt" [~1] forall x y. eftInt x y = build (\ c n -> eftIntFB c n x y)
+"eftInt" [~1] forall x y. eftInt x y = cheapBuild (\ c n -> eftIntFB c n x y)
"eftIntList" [1] eftIntFB (:) [] = eftInt
#-}
@@ -539,7 +539,7 @@ eftIntFB c n x0 y | x0 ># y = n
{-# RULES
"efdtInt" [~1] forall x1 x2 y.
- efdtInt x1 x2 y = build (\ c n -> efdtIntFB c n x1 x2 y)
+ efdtInt x1 x2 y = cheapBuild (\ c n -> efdtIntFB c n x1 x2 y)
"efdtIntUpList" [1] efdtIntFB (:) [] = efdtInt
#-}
@@ -646,8 +646,8 @@ instance Enum Integer where
enumFromThenTo x y lim = enumDeltaToInteger x (y-x) lim
{-# RULES
-"enumDeltaInteger" [~1] forall x y. enumDeltaInteger x y = build (\c _ -> enumDeltaIntegerFB c x y)
-"efdtInteger" [~1] forall x y l.enumDeltaToInteger x y l = build (\c n -> enumDeltaToIntegerFB c n x y l)
+"enumDeltaInteger" [~1] forall x y. enumDeltaInteger x y = cheapBuild (\c _ -> enumDeltaIntegerFB c x y)
+"efdtInteger" [~1] forall x y l.enumDeltaToInteger x y l = cheapBuild (\c n -> enumDeltaToIntegerFB c n x y l)
"enumDeltaInteger" [1] enumDeltaIntegerFB (:) = enumDeltaInteger
"enumDeltaToInteger" [1] enumDeltaToIntegerFB (:) [] = enumDeltaToInteger
#-}
</pre>en-usGHChttp://ghc.haskell.org/trac/ghc/chrome/site/ghc_logo.png
http://ghc.haskell.org/trac/ghc/ticket/7206
Trac 1.0.1simonpjMon, 08 Oct 2012 14:03:33 GMT
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:1
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:1
<p>
See also <a class="new ticket" href="http://ghc.haskell.org/trac/ghc/ticket/7309" title="bug: The Ix instance for (,) leaks space in range (new)">#7309</a>
</p>
TicketnomeataMon, 08 Oct 2012 14:15:22 GMTcc set
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:2
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:2
<ul>
<li><strong>cc</strong>
<em>mail@…</em> added
</li>
</ul>
TicketcarterThu, 11 Oct 2012 00:26:34 GMTcc changed
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:3
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:3
<ul>
<li><strong>cc</strong>
<em>carter.schonwald@…</em> added
</li>
</ul>
TicketiglooSat, 20 Oct 2012 22:05:55 GMTowner, milestone set
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:4
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:4
<ul>
<li><strong>owner</strong>
set to <em>simonpj</em>
</li>
<li><strong>milestone</strong>
set to <em>7.8.1</em>
</li>
</ul>
TicketsimonpjWed, 13 Feb 2013 08:56:12 GMTmilestone changed
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:5
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:5
<ul>
<li><strong>milestone</strong>
changed from <em>7.8.1</em> to <em>_|_</em>
</li>
</ul>
<p>
I tried this again, but got disappointing results:
</p>
<pre class="wiki">--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
anna -0.4% +0.0% -6.3% -5.9% +0.0%
ansi -0.2% -0.6% 0.00 0.00 +0.0%
atom -0.1% +0.0% +2.0% +2.0% +0.0%
awards -0.4% +0.0% 0.00 0.00 +0.0%
banner -0.1% +0.0% 0.00 0.00 +0.0%
bernouilli -0.0% +0.0% -1.6% -1.6% +0.0%
binary-trees -0.2% +0.0% +4.4% +4.4% +0.0%
boyer -0.2% +0.0% 0.07 0.07 +0.0%
boyer2 -0.5% +0.0% 0.01 0.01 +0.0%
bspt -0.3% +0.0% 0.02 0.02 +0.0%
cacheprof -1.4% +0.2% -0.7% -0.9% -0.5%
calendar -0.2% +0.0% 0.00 0.00 +0.0%
cichelli -0.4% -0.0% 0.13 0.13 +0.0%
circsim -0.2% +0.0% -4.6% -4.7% +0.0%
clausify -0.2% +0.0% 0.06 0.06 +0.0%
comp_lab_zift -0.2% +0.0% -2.6% -2.6% +0.0%
compress -0.3% +0.0% +3.0% +3.7% +0.0%
compress2 -0.1% +0.0% +1.3% +1.3% +0.0%
constraints -0.2% +0.0% -3.9% -3.9% +0.0%
cryptarithm1 -0.2% +0.0% +6.3% +6.3% +0.0%
cryptarithm2 -0.5% +0.0% 0.02 0.02 +0.0%
cse -0.5% +0.0% 0.00 0.00 +0.0%
eliza -0.1% +0.0% 0.00 0.00 +0.0%
event -0.2% +0.0% +2.9% +2.9% +0.0%
exp3_8 -0.2% +0.0% -2.8% -2.8% +0.0%
expert -0.1% +0.0% 0.00 0.00 +0.0%
fannkuch-redux -0.1% +0.0% -2.4% -2.4% +0.0%
fasta -0.2% +0.0% -0.4% -0.4% +0.0%
fem -0.0% +0.0% 0.04 0.04 +0.0%
fft -0.1% +0.0% 0.06 0.06 +0.0%
fft2 -0.1% +0.0% 0.10 0.10 +0.0%
fibheaps -0.2% +0.0% 0.05 0.05 +0.0%
fish -0.1% +2.7% 0.03 0.03 +0.0%
fluid -0.1% +0.2% 0.01 0.01 +100.0%
fulsom -0.3% +0.0% -4.0% -4.0% +3.2%
gamteb -0.1% -0.0% 0.07 0.07 +0.0%
gcd -0.1% +0.0% 0.05 0.05 +0.0%
gen_regexps -0.1% +0.0% 0.00 0.00 +0.0%
genfft -0.1% +0.0% 0.06 0.06 +0.0%
gg -0.5% +0.1% 0.02 0.02 +0.0%
grep -0.1% +0.0% 0.00 0.00 +0.0%
hidden -0.1% +0.0% +4.2% +4.2% +0.0%
hpg -0.4% -0.2% +8.8% +8.8% +0.0%
ida -0.2% +0.0% 0.14 0.14 +0.0%
infer -0.3% +0.0% 0.13 0.13 +0.0%
integer -0.1% +0.0% +0.1% +0.1% +0.0%
integrate -0.1% +0.0% +18.2% +18.2% +0.0%
k-nucleotide -0.1% +0.0% -6.2% -6.2% +0.0%
kahan -0.1% +0.0% +1.1% +1.1% +0.0%
knights -0.2% +1.4% 0.01 0.01 +0.0%
lcss -0.2% +0.0% -3.5% -3.5% +0.0%
life -0.2% +0.0% +2.5% +2.5% +0.0%
lift -0.3% +0.0% 0.00 0.00 +0.0%
listcompr -0.2% +0.0% 0.15 0.15 +0.0%
listcopy -0.2% +0.0% 0.16 0.16 +0.0%
maillist -0.3% +0.0% 0.10 0.11 -3.9%
mandel -0.0% -0.0% 0.12 0.12 +0.0%
mandel2 -0.5% +0.0% 0.00 0.00 +0.0%
minimax -0.3% +0.0% 0.00 0.00 +0.0%
mkhprog -0.2% +0.0% 0.00 0.00 +0.0%
multiplier -0.3% +0.0% +1.8% +1.8% +0.0%
n-body -0.2% +0.0% -0.7% -0.6% +0.0%
nucleic2 -0.1% +0.0% 0.11 0.11 +0.0%
para -0.0% +0.0% +1.4% +0.8% +0.0%
paraffins -0.2% +0.0% +13.6% +13.6% +0.0%
parser -1.3% +3.5% 0.05 0.05 +0.0%
parstof -0.3% +0.0% 0.01 0.01 +0.0%
pic -0.1% +0.1% 0.00 0.00 +0.0%
pidigits -0.1% +0.0% +1.5% +1.9% +0.0%
power -0.1% +0.0% -4.9% -4.9% +0.0%
pretty -0.1% -0.0% 0.00 0.00 +0.0%
primes -0.1% +0.0% 0.11 0.11 +0.0%
primetest -0.0% +0.0% 0.14 0.14 +0.0%
prolog -0.2% +0.1% 0.00 0.17 +0.0%
puzzle -0.3% +0.0% +1.7% +1.7% +0.0%
queens -0.2% +0.0% 0.02 0.02 +0.0%
reptile -0.2% -0.0% 0.02 0.02 +0.0%
reverse-complem -0.3% +0.0% +3.8% +3.8% +0.0%
rewrite -0.2% +0.0% 0.02 0.02 +0.0%
rfib -0.1% +0.0% 0.03 0.03 +0.0%
rsa -0.0% +0.0% 0.03 0.12 +0.0%
scc -0.2% +0.0% 0.00 0.00 +0.0%
sched -0.2% +0.0% 0.03 0.03 +0.0%
scs -0.0% +0.1% -2.9% -3.1% +0.0%
simple -0.9% +0.6% -10.8% -10.8% +13.8%
solid -0.1% +0.0% +3.2% +3.2% +0.0%
sorting -0.1% +0.0% 0.00 0.00 +0.0%
spectral-norm -0.2% +0.0% +0.1% +0.1% +0.0%
sphere -0.2% +0.0% 0.08 0.08 +0.0%
symalg -0.2% +0.0% 0.01 0.04 +0.0%
tak -0.2% +0.0% 0.02 0.02 +0.0%
transform -0.2% +0.0% -0.6% -0.6% +0.0%
treejoin -0.2% +0.0% +0.0% -0.7% +0.0%
typecheck -0.2% -0.0% +0.0% +0.0% +0.0%
veritas +0.7% +0.0% 0.00 0.00 +0.0%
wang -0.1% +0.0% -5.5% -5.5% +0.0%
wave4main -0.2% +0.0% -2.4% -2.4% +0.0%
wheel-sieve1 -0.1% +0.0% -0.5% -0.5% +0.0%
wheel-sieve2 -0.1% +0.0% -6.4% -6.4% +0.0%
x2n1 -0.2% +0.0% 0.00 0.00 +0.0%
--------------------------------------------------------------------------------
Min -1.4% -0.6% -10.8% -10.8% -3.9%
Max +0.7% +3.5% +18.2% +18.2% +100.0%
Geometric Mean -0.2% +0.1% +0.1% +0.1% +0.8%
</pre><p>
On investigation it seemed that although it was good to fuse away the intermediate <em>lists</em>, we were duplicating the individual <tt>Int</tt>s or <tt>Char</tt>s that would otherwise have been shared, and this pushes up allocation. So we often lose a bit and seldom win much.
</p>
<p>
For example, in <tt>spectral/fish</tt> we get a lot more calls to <tt>GHC.CString.unpackAppendCString#</tt>, which allocates not only cons cells (which is fine) but also the characters themselves. Without the <tt>cheapBuild</tt> stuff we just call <tt>(++)</tt> on a shared list of characters.
</p>
<p>
So I'm parking this again. I attach the patches I used.
</p>
<p>
Simon
</p>
TicketsimonpjWed, 13 Feb 2013 08:59:33 GMTattachment set
http://ghc.haskell.org/trac/ghc/ticket/7206
http://ghc.haskell.org/trac/ghc/ticket/7206
<ul>
<li><strong>attachment</strong>
set to <em>cheap-build-ghc.patch</em>
</li>
</ul>
<p>
Patch for GHC itself
</p>
TicketsimonpjWed, 13 Feb 2013 09:00:19 GMTattachment set
http://ghc.haskell.org/trac/ghc/ticket/7206
http://ghc.haskell.org/trac/ghc/ticket/7206
<ul>
<li><strong>attachment</strong>
set to <em>cheap-build-base.patch</em>
</li>
</ul>
<p>
Patch for base library
</p>
TicketsimonpjFri, 08 Nov 2013 15:44:22 GMT
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:6
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:6
<p>
If we do this, don't forget to use <tt>cheapBuild</tt> for <tt>repeat</tt>.
</p>
<p>
Simon
</p>
TicketdfeuerFri, 22 Aug 2014 21:25:56 GMT
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:7
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:7
<p>
Even more kinds of producers makes for even more fusion rules. If we decide to rewrite <em>all</em> consumers to <tt>foldr</tt>, I suppose that then opens up the possibility of having multiple producers instead, but expanding on both ends makes for a rule explosion. I'm also trying, on and off, to see if we can get away with writing all these enumerations using <tt>unfoldr</tt> or similar; I'm not sure how these ideas interact.
</p>
<p>
An idea that might be crazy: introduce a local annotation indicating that a certain function should be treated as <tt>CONLIKE</tt> at a particular call site. So then you'd have just <tt>{-# CHEAP #-} build blah</tt> and leave the fusion rules alone.
</p>
TicketsimonpjThu, 28 Aug 2014 08:33:38 GMT
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:8
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:8
<p>
See also <a class="ext-link" href="http://www.haskell.org/pipermail/ghc-devs/2014-August/006147.html"><span class="icon"></span>this thread</a>, and <a class="ext-link" href="http://www.haskell.org/pipermail/haskell-cafe/2014-August/115765.html"><span class="icon"></span>this one</a>.
</p>
TicketsimonpjThu, 28 Aug 2014 08:36:39 GMT
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:9
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:9
<p>
Another thought. Regardless of what we do here, it seems likely that GHC will sometimes make the wrong choice. So we should give the programmer a way to make the "right" choice.
</p>
<p>
Example. Suppose <tt>[1..10000]</tt> is floated out and shared. Then we should provide <tt>duplicableEnumFromTo 1 10000</tt> which means "please don't try to share me; instead fuse me with my consumers, even if that loses sharing". OK so you lose the nice notation, but you get to say what you want.
</p>
<p>
(And if we switch so that <tt>[1..1000]</tt> is by-default not-shared, then we should provide a way force it to be shared: <tt>nonDuplicableEnumFromTo 1 1000</tt>.
</p>
<p>
Simon
</p>
TicketsnoybergThu, 28 Aug 2014 08:50:19 GMTcc changed
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:10
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:10
<ul>
<li><strong>cc</strong>
<em>snoyberg</em> added
</li>
</ul>
<p>
Following up on a separate discussion on the haskell-cafe, I have a possibly related case:
</p>
<div class="code"><pre><span class="nf">main</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">main</span> <span class="ow">=</span> printLen <span class="o">>></span> printLen
<span class="nf">printLen</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">printLen</span> <span class="ow">=</span> lengthM <span class="mi">0</span> <span class="p">[</span><span class="mi">1</span><span class="o">..</span><span class="mi">40000000</span> <span class="ow">::</span> <span class="kt">Int</span><span class="p">]</span> <span class="o">>>=</span> print
<span class="nf">lengthM</span> <span class="ow">::</span> <span class="kt">Monad</span> m <span class="ow">=></span> <span class="kt">Int</span> <span class="ow">-></span> <span class="p">[</span>a<span class="p">]</span> <span class="ow">-></span> m <span class="kt">Int</span>
<span class="nf">lengthM</span> cnt <span class="kt">[]</span> <span class="ow">=</span> return cnt
<span class="nf">lengthM</span> cnt <span class="p">(</span><span class="kr">_</span><span class="kt">:</span>xs<span class="p">)</span> <span class="ow">=</span>
cnt' <span class="p">`</span>seq<span class="p">`</span> lengthM cnt' xs
<span class="kr">where</span>
cnt' <span class="ow">=</span> cnt <span class="o">+</span> <span class="mi">1</span>
</pre></div><p>
On my system, this takes almost 1.2GB of memory. If I comment out the second call to <tt>printLen</tt>, it takes 44KB.
</p>
TicketedskoThu, 04 Sep 2014 11:17:17 GMTcc changed
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:11
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:11
<ul>
<li><strong>cc</strong>
<em>edsko</em> added
</li>
</ul>
TicketekmettSun, 22 Feb 2015 12:27:36 GMTcc changed
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:12
http://ghc.haskell.org/trac/ghc/ticket/7206#comment:12
<ul>
<li><strong>cc</strong>
<em>ekmett</em> added
</li>
</ul>
Ticket