Opened 11 years ago

Closed 5 years ago

Last modified 4 weeks ago

#915 closed task (invalid)

Implement list fusion using streams instead of foldr/build

Reported by: simonpj Owned by:
Priority: normal Milestone: 7.6.1
Component: libraries/base Version: 6.8
Keywords: fusion Cc: dons@…, duncan.coutts@…, rl@…, Bulat.Ziganshin@…, johan.tibell@…, Deewiant, pumpkingod@…, mhenrion@…, pho@…, wasserman.louis@…, rturk@…, black.meph@…, ezyang@…, p.giarrusso@…, Jake.McArthur@…, leather@…, fuzxxl@…, favonia@…, hackage.haskell.org@…, the.dead.shall.rise@…, David.Feuer@…, sgraf1337@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case: N/A
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

We'd like to try using the stream-fusion idea of Don Stewart, Duncan Coutts and Roman Leshchinskiy, and replace the (somewhat fragile) foldr/build stuff.

See #876.

Change History (52)

comment:1 Changed 11 years ago by simonpj

Owner: set to simonpj

comment:2 Changed 11 years ago by igloo

Test Case: N/A

comment:3 Changed 11 years ago by dons

Architecture: UnknownMultiple
difficulty: UnknownProject (> 1 week)
Keywords: fusion added
Version: 6.4.26.6

This task has its own website now:

http://www.cse.unsw.edu.au/~dons/streams.html.

comment:4 Changed 10 years ago by guest

Cc: Bulat.Ziganshin@… added

it will be great to see this included in ghc base if this work is already completed

comment:5 Changed 10 years ago by duncan

It's not done yet. There's a paper on it but we still have issues with complex list comprehensions.

The new streams stuff will be included in the new bytestring-1.0 package though (as you know, the base package is being split up and ByteString will be in its own package).

comment:6 Changed 10 years ago by guest

comment:7 Changed 10 years ago by simonmar

Milestone: 6.8 branch6.10 branch

comment:8 Changed 10 years ago by simonmar

Owner: simonpj deleted

comment:9 Changed 10 years ago by dons

Cc: dons@… duncan.coutts@… rl@… added
Version: 6.66.8

We should also rerun the benchmarks in the light of the changes that made it into head since April when the fusion library was last benchmarked. There's no that much more to do here, to finish the job.

comment:10 Changed 10 years ago by tibbe

Cc: johan.tibell@… added

comment:11 Changed 9 years ago by igloo

Milestone: 6.10 branch6.12 branch

comment:12 Changed 9 years ago by Deewiant

Cc: Deewiant added

comment:13 Changed 9 years ago by dons

This library has been written and released,

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/stream-fusion

To replace GHC's existing list fusion, we'll need to work out how concatMap is optimised, and push the desugaring for list comprehensions into GHC.

For now though, users can just use the library, and avoid list comprehensions and [n .. m] (use enumFromTo).

comment:14 Changed 9 years ago by simonmar

Architecture: MultipleUnknown/Multiple

comment:15 Changed 9 years ago by simonmar

Operating System: UnknownUnknown/Multiple

comment:16 Changed 8 years ago by pumpkin

Cc: pumpkingod@… added

comment:17 Changed 8 years ago by mux

Cc: mhenrion@… added

comment:18 Changed 8 years ago by PHO

Cc: pho@… added

comment:19 Changed 8 years ago by simonmar

difficulty: Project (> 1 week)Project (more than a week)

comment:20 Changed 8 years ago by LouisWasserman

Type of failure: None/Unknown

I'm interested in pursuing this. What obstacles are present? Perhaps most importantly, what changes need to be made to make this work? Is it just GHC.List, Data.List, etcetera that need to be changed?

comment:21 Changed 8 years ago by LouisWasserman

Cc: wasserman.louis@… added

comment:22 Changed 8 years ago by batterseapower

AFAIK this ticket is no closer to being practical now that when dons added that comment 18 months ago. To implement it you need to change:

  • The list libraries, to use Stream based definitions
  • The desugaring for list comprehensions in GHC

However, it is still the case that noone knows how to implement concat / concatMap efficiently and without improving GHCs Core optimisers. Since concatMap is crucial to the the desugaring of list comprehensions, and also kind of important to users generally, stream fusion isn't really ready for prime time yet, IMHO.

An alternative to coming up with a clever library modification would be to improve GHCs optimiser. I found that the suggested approach of running a SAT pass in a fixed point loop with constructor specialisation was fairly reliable, but I haven't got around to either committing my improved SAT pass or making the fixed point I used not so much of a hack. Even if I did so, it is upsetting that stream fusion requires so much more work on the part of the optimiser to get right, compared to the fairly simply rewrite rule story with foldr/build.

(Both approaches suffer from fragility in the face of a lack of INLINE pragmas).

comment:23 Changed 7 years ago by igloo

Milestone: 6.12 branch6.12.3

comment:24 Changed 7 years ago by igloo

Milestone: 6.12.36.14.1
Priority: normallow

comment:25 Changed 7 years ago by Remi

Cc: rturk@… added

comment:26 Changed 7 years ago by BMeph

Cc: black.meph@… added

comment:27 Changed 7 years ago by igloo

Milestone: 7.0.17.0.2

comment:28 Changed 7 years ago by ezyang

Cc: ezyang@… added

For those playing along at home, more details on the inability to efficiently optimize concat/concatMap (more generally, fusion on nested lists) can be found in Section 7.2 of the "Stream Fusion" paper. T'would probably be nice if the GHC optimizer experiments were public somewhere.

comment:29 Changed 7 years ago by igloo

Milestone: 7.0.27.2.1

comment:30 Changed 6 years ago by Blaisorblade

Cc: p.giarrusso@… added

comment:31 in reply to:  22 Changed 6 years ago by Blaisorblade

Replying to batterseapower:

AFAIK this ticket is no closer to being practical now that when dons added that comment 18 months ago.[...]

However, it is still the case that noone knows how to implement concat / concatMap efficiently and without improving GHCs Core optimisers.[...]

An alternative to coming up with a clever library modification would be to improve GHCs optimiser.

[...I]t is upsetting that stream fusion requires so much more work on the part of the optimiser to get right, compared to the fairly simply rewrite rule story with foldr/build.

I wanted to focus on this part of the comment, which is about whether you actually want to merge the code. If this is agreed upon, then one needs to find somebody willing to finish the job, but that's easier with an agreement.

The paper argues (or claims) that the improved optimization is generally useful and not specific to stream fusion. Do you disagree? Is the code exceedingly ugly? Would you be ready to publish it somewhere? Are there any benchmark results from applying the optimization alone, compared with the standard optimizer? The paper mentions shortly that usually improves nofib performance slightly.

I also wonder: is sometimes the SAT transformation done by hand by programmers for performance?

comment:32 Changed 6 years ago by jmcarthur

Cc: Jake.McArthur@… added

comment:33 Changed 6 years ago by spl

Cc: leather@… added

comment:34 Changed 6 years ago by FUZxxl

Cc: fuzxxl@… added

comment:35 Changed 6 years ago by Favonia

Cc: favonia@… added

comment:36 Changed 6 years ago by liyang

Cc: hackage.haskell.org@… added

comment:37 Changed 6 years ago by refold

Cc: the.dead.shall.rise@… added

comment:38 Changed 6 years ago by igloo

Milestone: 7.2.17.4.1

comment:39 Changed 6 years ago by igloo

Milestone: 7.4.17.6.1
Priority: lownormal

comment:40 Changed 5 years ago by dfeuer

comment:41 Changed 5 years ago by dfeuer

Cc: David.Feuer@… added

comment:42 Changed 5 years ago by dfeuer

The package on hackage no longer compiles.

comment:43 Changed 5 years ago by igloo

Status: newinfoneeded

Duncan, Don,

Is concat/concatMap still an unsolved problem?

If so, can we close this ticket as requiring more research, rather than merely implementation?

comment:44 in reply to:  43 Changed 5 years ago by duncan

Replying to igloo:

Is concat/concatMap still an unsolved problem?

Sort-of. I describe a potential solution in my thesis (section 4.8.3).

If so, can we close this ticket as requiring more research, rather than merely implementation?

Yes.

comment:45 Changed 5 years ago by igloo

Resolution: invalid
Status: infoneededclosed

comment:46 Changed 4 years ago by duncan

We do now have a solution, however the current implementation relies on HERMIT. See:

The HERMIT in the Stream -- Fusing Stream Fusion’s concatMap by Andrew Farmer, Christian Höner zu Sierdissen and Andy Gill.

http://ittc.ku.edu/~afarmer/concatmap-pepm14.pdf

See also this summary/commentary: http://blog.ezyang.com/2014/01/pepm14-the-hermit-in-the-stream/

comment:47 Changed 10 months ago by George

Given the preceding comment by Duncan does it make sense to reopen this as a potential task?

comment:48 Changed 4 weeks ago by sgraf

On my journey through fusion land, I fell in love with the idea of stream fusion and spent quite some time thinking about the concatMap problem.

Apart from the HERMIT approach based on rewriting a (albeit large) subset of concatMap calls, I think the solution lies in call-pattern specialisation of function arguments, as already recognised in section 6.2 of the original paper on call-pattern specialisation.

The paper goes on to argue that it's too brittle to have rules matching on lambda terms. But then again, I don't see why we even need a RULE matching on a specific lambda term (or even terms with exprArity > 0), as these things are pretty much unique everytime they occur. E.g., having a RULE deduplicate specialisations would probably not help much anyway. So, if we would employ some other mechanism than rewrite rules to specialise call sites, as I understand SpecConstr currently does, I think we would be fine, wouldn't we?

At least this could get rid of the concatMap roadblock, are there any others I'm not aware of?

comment:49 Changed 4 weeks ago by sgraf

Cc: sgraf1337@… added

comment:50 Changed 4 weeks ago by simonpj

if we would employ some other mechanism than rewrite rules to specialise call sites

I don't know what you have in mind here. Would you care to elaborate?

comment:51 in reply to:  50 Changed 4 weeks ago by sgraf

Replying to simonpj:

if we would employ some other mechanism than rewrite rules to specialise call sites

I don't know what you have in mind here. Would you care to elaborate?

Right, that's quite a bummer. I'll probably have to develop a deeper understanding of how SpecConstr and the rewrite rule mechanism works before I can answer this.

comment:52 Changed 4 weeks ago by pggiarrusso

If anybody is unaware and interested: the latest work on the topic is "Stream Fusion, to Completeness" (https://strymonas.github.io/), and in Sec. 8 it seems to claim advantages relative to the HERMIT work (though I've only skimmed this), so that might be relevant.

Note: See TracTickets for help on using tickets.