Opened 6 years ago

Closed 5 years ago

#3630 closed feature request (invalid)

Suggested algorithm to control upper bound of space "leaks"

Reported by: shelbymoore3 Owned by:
Priority: normal Milestone:
Component: Compiler Version: 6.10.4
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

An idea for an algorithm to mitigate space "leaks".

Limited research (thus far) reveals that space "leaks" due to laziness are desireable function of the matrix of design choices and may be better automatically controlled in runtime (read both links to fully understand):

http://www.haskell.org/pipermail/haskell-cafe/2009-October/068382.html
http://www.haskell.org/pipermail/cvs-ghc/2009-October/050928.html

Proposes to fix to this bug (in theory):

http://hackage.haskell.org/trac/ghc/ticket/917

May you can cross-link from the bug, so people can read my links above to get a deeper understanding of why space "leaks" are really a feature, not a bug. And others can think about my idea at links above for a deterministic runtime upper bound solution (that proposes to be orthogonal to profiling and static optimization).

Change History (6)

comment:1 in reply to: ↑ description Changed 6 years ago by shelbymoore3

I cover in my links above the concept that:

Lazy space "leaks" are a desirable slider?

The point is that we should have both granular and brute-force (global) control, both at compile+profile and at run-time, over the static<->lazy trade-off. Then there is no significant disadvantage relative to other languages/run-times, and more advantages.

And I offer suggested algorithm for the brute-force control at run-time.

comment:3 follow-up: Changed 6 years ago by simonmar

  • difficulty set to Unknown

I still don't have a clear idea what it is you want to do.

I think you want to turn off updates for certain thunks, based on some metrics (or profile-directed feedback?). I pointed out that disabling thunk update might actually cause a space leak, because the free variables of the thunk have to be retained. Also you don't explain how you're going to decide which thunks should have update disabled.

I think probably a ticket is not the right place for this discussion - a wiki page might be better?

comment:4 in reply to: ↑ 3 Changed 6 years ago by shelbymoore3

Readers can continue clicking "next message" on the links I have provided to see all the discussion. For example:

http://www.haskell.org/pipermail/cvs-ghc/2009-November/050949.html

In link above I offered the idea that the decision on which thunks to not update would probably be best determined not individually, but stochastic-ally on timer in groups by determining if the (space allocated per unit time / thunks updated per unit time) was excessive given the level of paging (fault) load last reported by the virtual memory manager, and if so then reverting those memo-izations.

I understand that you said (at the mailing list), that we can not revert (discard the memo/normal form) without retaining the free variables, so I suggested at link above that we would retain these only temporarily in the GC nursery and make the decision on whether to revert, before these evaluated thunks are pushed out of the nursery. There would be no need to retain the revert overhead until the paging fault load is approaching the level we need to start throttling space allocation rate. We do not want to employ to binary (on/off) approach, but rather a rate reduction approach (so that gradually space intensive operations are not memo-ized, so that our annealing coverges, see http://www.haskell.org/pipermail/haskell-cafe/2009-November/068479.html).

We are not trying to kill all space leaks (e.g. free variables), but rather gradually slow down the rate (velocity) of which we are creating additional space allocation as we gradually approach some intolerable measure of the paging fault load. We know that paging fault load is orders-of-magnitude more costly performance-wise than CPU load. With a high paging fault load, the program can be unusable. So we do not need to worry about free variables retaining space allocation, as we are approaching this stochastically with global measure of the impact of additional space allocation (i.e. paging load).

Hopefully the above achieves a trade-off in absolute speed, for tolerable speed in spite of what would have been crushing space leaks. And hopefully there is no impact on absolute speed at all, until paging load has become a factor.

Note even though I have also tried to suggest this area of work (and alternative approaches) for a master's thesis, I think perhaps the solution above is straight forward enough to proceed:

http://www.haskell.org/pipermail/haskell-cafe/2009-November/068638.html

Maybe I would be willing to experiment if I had a better understanding of the Haskell source code. I suppose I could do that at some time in future, if no one jumps on this before then. It might be a year or more from now. I would prefer to see someone earn their masters on this, because I would get no compensation at all for working on this, and Haskell is not (yet) my area of specialization. It is nice to leave some work for fresh students, since this seems to be in the theoretical realm (as much as in the practical realm).

If a wiki is better, must I create one then post a link to it here?

P.S. Simon apologies if I have overly impinged on your time or if I have made any gross error. I am trying to help.

comment:5 Changed 6 years ago by simonmar

  • Milestone set to _|_
  • Type changed from proposal to feature request

comment:6 Changed 5 years ago by simonmar

  • Resolution set to invalid
  • Status changed from new to closed
  • Type of failure set to None/Unknown
Note: See TracTickets for help on using tickets.