Opened 22 months ago

Closed 4 months ago

#10830 closed bug (fixed)

maximumBy has a space leak

Reported by: NeilMitchell Owned by:
Priority: high Milestone: 8.2.1
Component: Core Libraries Version: 7.10.2
Keywords: Cc: ekmett
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D1205, Phab:D3019
Wiki Page:

Description

Given the program:

import Data.List
main = print $ maximumBy compare [1..10000]

Compiling with -O2, on GHC 7.8.3 this runs in constant stack space (works fine with +RTS -K1K). With GHC 7.10.2 I get:

$ ghc --make Test.hs -O2 -rtsopts
[1 of 1] Compiling Main             ( Test.hs, Test.o )
Linking Test.exe ...

$ Test +RTS -K100K
Stack space overflow: current size 33680 bytes.
Use `+RTS -Ksize -RTS' to increase it.

Not sure why it's failing at 33K instead of 100K, but it's certainly taking more than 1K as GHC 7.8.3 did.

See #3416 for previous discussion of this issue. My guess is that in older versions of GHC the strictness analysis managed to kick in and optimise things. With the burnt bridges that no longer works.

Change History (34)

comment:1 Changed 22 months ago by nomeata

Possible duplicate of #10788?

comment:2 Changed 22 months ago by NeilMitchell

I don't think so - maximum doesn't have a space leak, but maximumBy does, therefore it's a very different result.

comment:3 Changed 22 months ago by nomeata

In 7.8, the core is nice and fully inlined:

Rec {
main_lgo
main_lgo =
  \ z_a1k8 ds2_a1k9 ->
    case ds2_a1k9 of _ {
      [] -> z_a1k8;
      : x_a1kh xs_a1ki ->
        case compareInteger z_a1k8 x_a1kh of _ {
          __DEFAULT -> main_lgo x_a1kh xs_a1ki;
          GT -> main_lgo z_a1k8 xs_a1ki
        }
    }
end Rec }

In HEAD, we have a call to foldr1:

main5 =
  \ x_a2pT y_a2pU ->
    case compareInteger x_a2pT y_a2pU of _ {
      __DEFAULT -> y_a2pU;
      GT -> x_a2pT
    }
main2 = enumDeltaToInteger1 main4 main3
main1 =
  \ eta_B1 ->
    case foldr1 main5 main2 of _ { __DEFAULT -> (# eta_B1, () #) }

I think you are right that this is some BBP-fallout. Note that maximumBy is implemented in Data.Foldable using foldr1.

If I make foldr1 inlinable (by using a local worker), I get

Rec {
-- RHS size: {terms: 19, types: 9, coercions: 0}
main_go
main_go =
  \ ds2_a1Fi eta_a1Fj ->
    case ds2_a1Fi of _ {
      [] -> eta_a1Fj;
      : y_a1Fr ys_a1Fs ->
        case compareInteger eta_a1Fj y_a1Fr of _ {
          __DEFAULT -> main_go ys_a1Fs y_a1Fr;
          GT -> main_go ys_a1Fs eta_a1Fj
        }
    }
end Rec }

which is equivalent to the 7.8 code.

Given that we inline foldr, I do not see a reason not to inline foldr1. DR coming soon...

comment:4 Changed 22 months ago by nomeata

Differential Rev(s): Phab:D1205
Status: newpatch

comment:5 Changed 22 months ago by thomie

Not sure why it's failing at 33K instead of 100K

See #10445.

comment:6 Changed 22 months ago by NeilMitchell

@thomie: Note that #10445 are instances of GHC failing at a higher stack size than you requested, while above is an example of it failing at a lower stack size. Not sure if that makes it different or not though.

Thanks @nomeata!

comment:7 Changed 22 months ago by Joachim Breitner <mail@…>

In 85915e9/ghc:

Make Data.List.foldr1 inline

Previously, foldr1 would be defiend recursively and thus not inline.
This is bad, for example, when maximumBy has a strict comparison
function: Before the BBP, it was implemented via foldl1, which inlined
and yielded good code. With BBP, it goes via foldr1, so we better inline
this as well. Fixes #10830.

Differential Revision: https://phabricator.haskell.org/D1205

comment:8 Changed 22 months ago by nomeata

Resolution: fixed
Status: patchclosed

comment:9 Changed 22 months ago by rwbarton

Is #10788 related, btw?

comment:10 Changed 22 months ago by thoughtpolice

Milestone: 7.10.3

If 7.10.3 ever exists (no commitment!), then this should go in.

comment:11 in reply to:  9 Changed 22 months ago by nomeata

Replying to rwbarton:

Is #10788 related, btw?

Only morally related, not technially: minimumBy goes via Foldable stuff, so we end up with foldr1, while minimum is (yet) a member of the class and for lists, a direct definition is given, which has the inlining difficulties discussed in #10788.

comment:12 Changed 21 months ago by NeilMitchell

I'm a little confused. The diff fixes foldr1, I can't quite understand how any version based on foldr1 could ever avoid the space leak - although I agree that the inlining may reduce the space leak by a constant factor. I also note that the test has 100K of stack, while with GHC 7.8 it was fine in 1K of stack - perhaps the test bound should be significantly reduced? And finally, the code as currently written, maximumBy calls foldl not foldr, did that change after your patch went in?

comment:13 Changed 21 months ago by nomeata

Resolution: fixed
Status: closednew

I'm a little confused.

So am I :-)

Looks like I modified foldr1, while the test case still tested maximumBy from Data.OldList which still goes via foldl1, and thus was always ok for the test case.

The maximumBy that one gets from Data.List is the one from Foldable, which goes via foldr1, so it stills blows the stack.

But I’m not sure what to do here. We might be able to special-case Int and Integer, but that does not seem to be a good solution...

comment:14 Changed 21 months ago by NeilMitchell

It's pretty rare that you are using maximumBy with a custom function and the result type is Int or Integer - typically you'd just use minimum or maximum directly - so I don't think special cases are likely to be of any benefit. I think using foldl1 is the only way to reduce the space leak, which is what happened in base-4.7 for Data.List. The Foldable variants have always been foldr1, but maybe that's a bug in Foldable? It's certainly a BBP regression.

comment:15 Changed 21 months ago by nomeata

Yes, ignore the comment about special-casing. That only works for minimum and maximum.

I’ve prodded Herbert and Edward on IRC about this, as this is more a BBP design issue than a compiler bug.

comment:16 in reply to:  10 Changed 21 months ago by George

Replying to thoughtpolice:

If 7.10.3 ever exists (no commitment!), then this should go in.

Shouldn't priority be set to high if we want it in 7.10.3?

comment:17 Changed 20 months ago by bgamari

Resolution: fixed
Status: newclosed

Merged to ghc-7.10.

comment:18 Changed 20 months ago by nomeata

Resolution: fixed
Status: closednew

While the patch is nice, I don’t think this bug is fixed (see comment:13). Also see https://mail.haskell.org/pipermail/libraries/2015-October/026430.html

comment:19 Changed 20 months ago by bgamari

Milestone: 7.10.38.0.1

Oops! Sorry, I overlooked that.

I'm going to re-milestone to 8.0 since there is currently no fix.

comment:20 Changed 18 months ago by thomie

Operating System: WindowsUnknown/Multiple
Type of failure: None/UnknownRuntime performance bug

comment:21 Changed 17 months ago by rwbarton

This is also a semantic bug because the Haskell Report specifies that maximumBy has the behavior of foldl1.

We fixed this for sum and maximum and so on by making them instance methods of Foldable. We should just do the same for maximumBy and minimumBy.

comment:23 Changed 17 months ago by rwbarton

Cc: ekmett added
Component: libraries/baseCore Libraries

comment:24 Changed 16 months ago by bgamari

Milestone: 8.0.18.2.1

This bug won't be fixed in 8.0.1; bumping to 8.2.

comment:25 Changed 6 months ago by rwbarton

Priority: normalhigh

comment:26 Changed 6 months ago by ekmett

In response to Reid's prodding, I see two decisions to make here:

1.) There are ultimately 3 possible default implementations of minimumBy/maximumBy based on foldl1, foldr1, or foldMap respectively.

2.) We also need to decide whether we should add maximumBy and minimumBy to the class.

maximum and minimum being in the class can improve the asymptotics of these operations. e.g. Set offers O(log n) minimum and maximum today, unlike before. Moreover, in the case of minimum and maximum 'a' is a fully knowable type for some containers, so they can even handle some infinite cases.

Having them in the class also happened to also let us paper over the difference between the left fold and right fold in the old Foldable code for maximum and minimum for lists and fix the analogous space leak there. We didn't go with a right fold, but went with a monoidal fold instead as it was never asymptotically worse than the right fold we had in Foldable before.

But here the situation is a bit different. Without more knowledge about the b type in the (a -> b) fitness function passed in there is no scenario in which a foldr1-based maximumBy can get early termination, so there is no real justification to the current implementation. It is leakier than a foldl1 solution and never better!

Even trying to argue there is some theoretical kind of snoclist container doesn't pan out, if our chosen semantics is to return the first value with the maximum value by fitness function, you'd have to reassociate all the way anyways.

So at the very least there is no justification for a foldr1 based solution.

Ultimately, maximumBy and minimumBy are both operations that only make sense for finite containers. Here we're stuck touching all of the distinct members of the container at the least once as we don't know what fitness function the user will supply or any properties of the output type, to say, get early termination by reaching minBound/maxBound, or exploit properties of the type, like the laziness of "Lazy Nat" max.

Adding maximumBy/minimumBy to the class would in theory allow a few obscure cases: The only real thing they can do is take advantage of the knowledge of which distinct 'a's they hold and send them through the scoring function individually without repetition. This would allow some containers to exploit a monoidal fold and pay proportional to the number of distinct values in the container rather than the total number of values. e.g. something like http://hackage.haskell.org/package/compressed-3.10/docs/Data-Compressed-LZ78.html could either just skim the tokens rather than do a full decompression, or use a monoidal fold to get most of that benefit in practice. If we have k distinct values, a monoidal fold version of maximumBy for some containers that tracked distinct values could get O(k) rather than O(n). Somewhat weaker, the compressed code above could pay proportional to the size of the compressed data set, not the full data set.

Basing something on foldMap without the ability to redefine it in the class would effectively result in the bad foldr-style stack usage for lists.

And frankly, it'd be nice to have a better story rather than 'lets favor monoidal implementations but hack in left folds for lists' pattern that seems to be cropping up over and over!

Off the cuff, we could possibly address the root cause of this and a couple other issues by adding a foldMap' that uses a foldl' evaluation order on lists and strict monoidal accumulator and basing the (possibly default) definition of both maximum and maximumBy on that combinator. This appeals to me as it could serve to reduce pressure to add more members to the class of this sort, while avoiding more of these sorts of stack space regressions.

I say "off the cuff" as I haven't had time to work through all the consequences of such a combinator. and it isn't the only design in this space, other designs in this space include making a single hideous Frankenstein combinator that takes multiple fold types worth of arguments and chooses an implementation accordingly, etc.

A straw man solution would be to add foldMap' to the class, switch maximum and maximumBy to invoke it, and have it perform a monoidal fold in foldl' style either by default or just in the list case. As an embellishment on that design we could even add sum' and product' combinators that went through this path rather than randomly leak space.

This would need some design work to work through consequences and implementation, but seems doable.

tl;dr At the very least I agree that the current foldr1 implementation of maximumBy and minimumBy is a mistake.

comment:27 Changed 6 months ago by rwbarton

I'm happy with any resolution under which maximumBy on a [a] is defined in terms of foldl1 (presuming that, as a result, strictness analysis can remove the space leak for maximumBy on a [Int], as was the case in GHC 7.8). I didn't really mean to push for the "add a new method to Foldable" solution specifically. (That was motivated by maintaining the old behavior for other Foldable instances as far as possible, but if the old behavior is actually useless then that motivation doesn't apply.)

If there's a minimal change that we could apply now to obtain the above condition, while leaving the door open to future extensions, I'd be in favor of making the minimal change now, since the current situation with maximumBy on lists is really not very good.

comment:28 Changed 6 months ago by ekmett

I'm okay with switching maximumBy and minimumBy to foldl1 for now. This would at least fix the stack space regression relative to the original [] implementation. It'd be a big step in the right direction from the status quo.

If later on we can come up with a foldMap' or similar alternative with palatable semantics we can switch over to that and we should be able to retain the stack benefits. This would buy us time to fiddle with the semantics and implementation of such a combinator and see how well it can optimize.

Breaking up the fix like this would avoid letting the quest for the perfect solution derail us from fixing an annoying regression in a common combinator today, and even if the second stage never happened this path would still make most consumers happy.

comment:29 Changed 5 months ago by dalaing

Differential Rev(s): Phab:D1205Phab:D1205, Phab:D3109
Status: newpatch

comment:30 Changed 5 months ago by dalaing

I'm not sticking my hand up to dive into the depths of this, but I made the change from foldr1 to foldl1 and run a validate in case that's what people want for the time being.

Feel free to knock the patch on the head if folks want to go for a deeper / more involved approached.

comment:31 Changed 5 months ago by ekmett

Sounds good.

comment:32 Changed 4 months ago by RyanGlScott

Differential Rev(s): Phab:D1205, Phab:D3109Phab:D1205, Phab:D3019

comment:33 Changed 4 months ago by Ben Gamari <ben@…>

In 2e438482/ghc:

Fixes a spaceleak in `maximumBy` and `minimumBy` (#10830).

This involved changing the implementation from using
`foldr11` to using `foldl1`.

Test Plan: validate

Reviewers: austin, hvr, bgamari, dalaing, dfeuer

Reviewed By: bgamari

Subscribers: RyanGlScott, dfeuer, thomie

Differential Revision: https://phabricator.haskell.org/D3019

comment:34 Changed 4 months ago by RyanGlScott

Resolution: fixed
Status: patchclosed

I'm going to close this ticket, as the immediate issue of maximumBy and minimumBy using too much stack space with lists has been fixed with 2e43848236a4b80015d8fb09a87f6f6a746c1365.

There's still the other issue of ways to improve the current situation even better that ekmett discusses in comment:26, but I believe that is best left to a separate ticket.

Note: See TracTickets for help on using tickets.