Opened 8 years ago

Closed 20 months ago

Last modified 3 months ago

#915 closed task (invalid)

Implement list fusion using streams instead of foldr/build

Reported by: simonpj Owned by:
Priority: normal Milestone: 7.6.1
Component: libraries/base Version: 6.8
Keywords: fusion Cc: dons@…, duncan.coutts@…, rl@…, Bulat.Ziganshin@…, johan.tibell@…, Deewiant, pumpkingod@…, mhenrion@…, pho@…, wasserman.louis@…, rturk@…, black.meph@…, ezyang@…, p.giarrusso@…, Jake.McArthur@…, leather@…, fuzxxl@…, favonia@…, hackage.haskell.org@…, the.dead.shall.rise@…, David.Feuer@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Project (more than a week)
Test Case: N/A Blocked By:
Blocking: Related Tickets:

Description

We'd like to try using the stream-fusion idea of Don Stewart, Duncan Coutts and Roman Leshchinskiy, and replace the (somewhat fragile) foldr/build stuff.

See #876.

Change History (46)

comment:1 Changed 8 years ago by simonpj

  • Owner set to simonpj

comment:2 Changed 8 years ago by igloo

  • Test Case set to N/A

comment:3 Changed 7 years ago by dons

  • Architecture changed from Unknown to Multiple
  • Difficulty changed from Unknown to Project (> 1 week)
  • Keywords fusion added
  • Version changed from 6.4.2 to 6.6

This task has its own website now:

http://www.cse.unsw.edu.au/~dons/streams.html.

comment:4 Changed 7 years ago by guest

  • Cc Bulat.Ziganshin@… added

it will be great to see this included in ghc base if this work is already completed

comment:5 Changed 7 years ago by duncan

It's not done yet. There's a paper on it but we still have issues with complex list comprehensions.

The new streams stuff will be included in the new bytestring-1.0 package though (as you know, the base package is being split up and ByteString? will be in its own package).

comment:6 Changed 7 years ago by guest

comment:7 Changed 6 years ago by simonmar

  • Milestone changed from 6.8 branch to 6.10 branch

comment:8 Changed 6 years ago by simonmar

  • Owner simonpj deleted

comment:9 Changed 6 years ago by dons

  • Cc dons@… duncan.coutts@… rl@… added
  • Version changed from 6.6 to 6.8

We should also rerun the benchmarks in the light of the changes that made it into head since April when the fusion library was last benchmarked. There's no that much more to do here, to finish the job.

comment:10 Changed 6 years ago by tibbe

  • Cc johan.tibell@… added

comment:11 Changed 6 years ago by igloo

  • Milestone changed from 6.10 branch to 6.12 branch

comment:12 Changed 6 years ago by Deewiant

  • Cc Deewiant added

comment:13 Changed 6 years ago by dons

This library has been written and released,

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/stream-fusion

To replace GHC's existing list fusion, we'll need to work out how concatMap is optimised, and push the desugaring for list comprehensions into GHC.

For now though, users can just use the library, and avoid list comprehensions and [n .. m] (use enumFromTo).

comment:14 Changed 6 years ago by simonmar

  • Architecture changed from Multiple to Unknown/Multiple

comment:15 Changed 6 years ago by simonmar

  • Operating System changed from Unknown to Unknown/Multiple

comment:16 Changed 5 years ago by pumpkin

  • Cc pumpkingod@… added

comment:17 Changed 5 years ago by mux

  • Cc mhenrion@… added

comment:18 Changed 5 years ago by PHO

  • Cc pho@… added

comment:19 Changed 4 years ago by simonmar

  • Difficulty changed from Project (> 1 week) to Project (more than a week)

comment:20 Changed 4 years ago by LouisWasserman

  • Type of failure set to None/Unknown

I'm interested in pursuing this. What obstacles are present? Perhaps most importantly, what changes need to be made to make this work? Is it just GHC.List, Data.List, etcetera that need to be changed?

comment:21 Changed 4 years ago by LouisWasserman

  • Cc wasserman.louis@… added

comment:22 follow-up: Changed 4 years ago by batterseapower

AFAIK this ticket is no closer to being practical now that when dons added that comment 18 months ago. To implement it you need to change:

  • The list libraries, to use Stream based definitions
  • The desugaring for list comprehensions in GHC

However, it is still the case that noone knows how to implement concat / concatMap efficiently and without improving GHCs Core optimisers. Since concatMap is crucial to the the desugaring of list comprehensions, and also kind of important to users generally, stream fusion isn't really ready for prime time yet, IMHO.

An alternative to coming up with a clever library modification would be to improve GHCs optimiser. I found that the suggested approach of running a SAT pass in a fixed point loop with constructor specialisation was fairly reliable, but I haven't got around to either committing my improved SAT pass or making the fixed point I used not so much of a hack. Even if I did so, it is upsetting that stream fusion requires so much more work on the part of the optimiser to get right, compared to the fairly simply rewrite rule story with foldr/build.

(Both approaches suffer from fragility in the face of a lack of INLINE pragmas).

comment:23 Changed 4 years ago by igloo

  • Milestone changed from 6.12 branch to 6.12.3

comment:24 Changed 4 years ago by igloo

  • Milestone changed from 6.12.3 to 6.14.1
  • Priority changed from normal to low

comment:25 Changed 4 years ago by Remi

  • Cc rturk@… added

comment:26 Changed 3 years ago by BMeph

  • Cc black.meph@… added

comment:27 Changed 3 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:28 Changed 3 years ago by ezyang

  • Cc ezyang@… added

For those playing along at home, more details on the inability to efficiently optimize concat/concatMap (more generally, fusion on nested lists) can be found in Section 7.2 of the "Stream Fusion" paper. T'would probably be nice if the GHC optimizer experiments were public somewhere.

comment:29 Changed 3 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:30 Changed 3 years ago by Blaisorblade

  • Cc p.giarrusso@… added

comment:31 in reply to: ↑ 22 Changed 3 years ago by Blaisorblade

Replying to batterseapower:

AFAIK this ticket is no closer to being practical now that when dons added that comment 18 months ago.[...]

However, it is still the case that noone knows how to implement concat / concatMap efficiently and without improving GHCs Core optimisers.[...]

An alternative to coming up with a clever library modification would be to improve GHCs optimiser.

[...I]t is upsetting that stream fusion requires so much more work on the part of the optimiser to get right, compared to the fairly simply rewrite rule story with foldr/build.

I wanted to focus on this part of the comment, which is about whether you actually want to merge the code. If this is agreed upon, then one needs to find somebody willing to finish the job, but that's easier with an agreement.

The paper argues (or claims) that the improved optimization is generally useful and not specific to stream fusion. Do you disagree? Is the code exceedingly ugly? Would you be ready to publish it somewhere? Are there any benchmark results from applying the optimization alone, compared with the standard optimizer? The paper mentions shortly that usually improves nofib performance slightly.

I also wonder: is sometimes the SAT transformation done by hand by programmers for performance?

comment:32 Changed 3 years ago by jmcarthur

  • Cc Jake.McArthur@… added

comment:33 Changed 3 years ago by spl

  • Cc leather@… added

comment:34 Changed 3 years ago by FUZxxl

  • Cc fuzxxl@… added

comment:35 Changed 3 years ago by Favonia

  • Cc favonia@… added

comment:36 Changed 3 years ago by liyang

  • Cc hackage.haskell.org@… added

comment:37 Changed 3 years ago by refold

  • Cc the.dead.shall.rise@… added

comment:38 Changed 3 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:39 Changed 2 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from low to normal

comment:40 Changed 20 months ago by dfeuer

comment:41 Changed 20 months ago by dfeuer

  • Cc David.Feuer@… added

comment:42 Changed 20 months ago by dfeuer

The package on hackage no longer compiles.

comment:43 follow-up: Changed 20 months ago by igloo

  • Status changed from new to infoneeded

Duncan, Don,

Is concat/concatMap still an unsolved problem?

If so, can we close this ticket as requiring more research, rather than merely implementation?

comment:44 in reply to: ↑ 43 Changed 20 months ago by duncan

Replying to igloo:

Is concat/concatMap still an unsolved problem?

Sort-of. I describe a potential solution in my thesis (section 4.8.3).

If so, can we close this ticket as requiring more research, rather than merely implementation?

Yes.

comment:45 Changed 20 months ago by igloo

  • Resolution set to invalid
  • Status changed from infoneeded to closed

comment:46 Changed 3 months ago by duncan

We do now have a solution, however the current implementation relies on HERMIT. See:

The HERMIT in the Stream -- Fusing Stream Fusion’s concatMap by Andrew Farmer, Christian Höner zu Sierdissen and Andy Gill.

http://ittc.ku.edu/~afarmer/concatmap-pepm14.pdf

See also this summary/commentary: http://blog.ezyang.com/2014/01/pepm14-the-hermit-in-the-stream/

Note: See TracTickets for help on using tickets.