The notion I think dfeuer was trying to get at there was that replicateM can do 'peasant multiplication' like (^) does to get away with O(log n) distinct calls to (<*>).
Don't give me so much credit; I was just thinking about eliminating the need for list fusion rules to fire to prevent GHC from building an intermediate list. The peasant multiplication idea is quite an interesting one. It would sometimes be good, and I don't think it could ever be terrible, but it certainly could be worse in some cases. I'd probably prefer separate combinators for fast exponentiation of Monoids and Applicatives.
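For concreteness, a separate fast-exponentiation combinator for Monoids along those lines might look like this (a minimal sketch; the name mtimes is made up here — base later grew the similar stimes in Data.Semigroup):

```haskell
-- A sketch of fast Monoid exponentiation via binary ("peasant")
-- multiplication. Hypothetical name; compare stimes in Data.Semigroup.
mtimes :: Monoid m => Int -> m -> m
mtimes n0 x0
  | n0 <= 0   = mempty
  | otherwise = go n0 x0 mempty
  where
    -- Invariant: answer = acc <> x^n; O(log n) uses of (<>) in total.
    go n x acc
      | n <= 0    = acc
      | odd n     = go (n `div` 2) (x <> x) (acc <> x)
      | otherwise = go (n `div` 2) (x <> x) acc
```

Since every factor combined is a power of the same element, the combinations commute with each other, so this is sound even for non-commutative Monoids.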
E.g., we carefully use the doubling scheme mentioned above to handle replicateM and the like in Data.Sequence, and it makes a night-and-day difference in the performance of many combinators for generating those structures.
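The doubling scheme, specialized to producing a list, might be sketched like this (the name replicateA is borrowed from Data.Sequence, but this list-based version is purely illustrative — plain lists, unlike Seq, cannot share the doubled structure):

```haskell
import Control.Applicative (liftA2)

-- A sketch of a replicateM variant making only O(log n) distinct calls
-- to (<*>), in the spirit of Data.Sequence's replicateA. Illustrative
-- only: lists cannot share the doubled subterms the way Seq can.
replicateA :: Applicative f => Int -> f a -> f [a]
replicateA n0 f
  | n0 <= 0   = pure []
  | otherwise = go n0 (fmap (: []) f) (pure [])
  where
    -- Binary exponentiation in the monoid (f [a], liftA2 (++), pure []).
    go n x acc
      | n <= 0    = acc
      | odd n     = go (n `div` 2) (double x) (liftA2 (++) acc x)
      | otherwise = go (n `div` 2) (double x) acc
    double x = liftA2 (++) x x
```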
My concern is that it will add that tiny factor to tight loops in IO and ST where every tiny little bit counts. Admittedly, the fact that I can't use an STRef s Int counter and get GHC to put it in a register rules out the use of replicateM_ for a significant class of such loops, but I suspect my point is still valid.
Data.Sequence is a bit weird. I doubt too many things offer a Wasserman special like that replicateA. Do you know how the doubling approach plays with branch prediction?
Consuming the result with Writer, Const, Reader, update monads, and the like all benefit rather disproportionately. The Const case in particular is relevant, as it basically shows how we'd foldMap, and it and the Writer case bring in every Monoid as something that can be folded over optimally. In general, arbitrarily inducing a bias here ensures that nothing smart can be done by the consumer: you enforce worst-case asymptotic behavior by losing sharing.
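As a concrete illustration of the Const point, using the actual Data.Sequence.replicateA:

```haskell
import Data.Functor.Const (Const (..))
import Data.Monoid (Sum (..))
import qualified Data.Sequence as Seq

-- Running Data.Sequence.replicateA in Const never builds a Seq at all:
-- the O(log n) calls to (<*>) become O(log n) calls to (<>), i.e. fast
-- Monoid exponentiation -- essentially what foldMap over the replicated
-- structure would compute, but without ever materializing it.
constReplicate :: Monoid m => Int -> m -> m
constReplicate n m = getConst (Seq.replicateA n (Const m))
```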
One compromise might be to look at whether RULES could try out the biased association and then rewrite back if it doesn't fuse away?
Or even simpler, add RULES to rewrite the IO / ST cases into the biased form. We provably get no benefit from the tree for that style of effect and can't observe the difference in associativity for these concrete cases.
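For reference, the biased target such a rule would rewrite to is just a plain counting loop (hypothetical name):

```haskell
-- The biased, right-nested form for IO/ST: a counting loop that builds
-- no intermediate structure and makes n sequential calls to (*>).
-- The primed name is made up to distinguish it from base's replicateM_.
replicateM_' :: Applicative f => Int -> f a -> f ()
replicateM_' n0 f = go n0
  where
    go n
      | n <= 0    = pure ()
      | otherwise = f *> go (n - 1)
```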
As you note basically any transformer is going to have enough overhead to swamp this concern, so it is really only those sorts of things that are problematic.
I was too tired when I wrote that remark, so I forgot about transformers. StateT u (ST s) actually is a perfectly beautiful target for replicateM_. With the current definition, that *can* be unboxed, allowing for some serious vector crunching. I think your version should be in the library; I just think it should have a new name.
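For example, a loop of this shape (a sketch using transformers' strict StateT; whether the Int state actually unboxes depends on -O2 and inlining):

```haskell
import Control.Monad (replicateM_)
import Control.Monad.ST (runST)
import Control.Monad.Trans.State.Strict (execStateT, modify')

-- The kind of tight loop described above: replicateM_ over
-- StateT Int (ST s). With the current right-nested replicateM_,
-- GHC can (under optimization) unbox the Int state.
countUp :: Int -> Int
countUp n = runST (execStateT (replicateM_ n (modify' (+ 1))) 0)
```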
I would be worried about not just IO but also some kind of structure that is intended to be produced and consumed in a streaming fashion, in O(1) space. Seems like that could blow up space usage arbitrarily with a repeated-squaring implementation.
Including separate versions that use repeated squaring would make more sense (if we should include them at all; the use cases seem obscure and anyone can (already) just provide them in a package outside base).