It is a 185kB, 5,800-line module consisting almost entirely of data, class, and instance declarations.
It might be interesting to use this module as a test case to profile GHC's front end to see if there are any obvious inefficiencies or unnecessary non-linear algorithms.
WASH/HTML/HTMLPrelude.hs is almost as bad. Between them, the two modules push the overall compile time for WASH to several minutes with -O0 and nearly half an hour with -O1. GHC's memory use while compiling WASH also grows to over 300MB with -O0 and over 600MB with -O1 (on a 64-bit box).
All in all, WASH is an excellent stress test for GHC.
Trac metadata

| Trac field | Value |
|---|---|
| Version | 6.8.1 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | low |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | Multiple |
| Architecture | Multiple |
```haskell
main :: IO ()
main = writeFile "W.hs" $ unlines $ map unlines foo

foo :: [[String]]
foo = [ "module J where",
        "class C a where",
        " c :: a -> String",
        " d :: a -> String",
        " d x = c x",
        " e :: a -> String",
        " e x = c x" ]
    : [ [ "data " ++ d ++ " = " ++ d,
          "instance C " ++ d ++ " where",
          " c " ++ d ++ " = \"" ++ d ++ "\"" ]
      | i <- [1..1000], let d = 'A' : show i ]
```
Ian: you absolutely nailed the problem, thank you. I refactored a bit, and behold the specialiser pass runs really fast on that code now. Here's the patch, which might be worth merging to the branch:
```
Mon Apr 28 16:57:11 BST 2008  simonpj@microsoft.com
  * Fix Trac #1969: perfomance bug in the specialiser
```
You might want to check that it cures the problem on the original code?
I don't quite know how to add a test; maybe not worth the trouble.
PS: I've just noticed that the original report described a problem even without -O, so that part at least can't be cured by my patch, since the specialiser only runs with -O.
So, perhaps you can see if that is the case, and re-open if so?
Ian diagnosed the problem. Basically we're compiling lots of things like:
```haskell
data A24 = A24

instance C A24 where
    c A24 = "A24"
```
where
```haskell
class C a where
    c :: a -> String
    d :: a -> String
    d x = c x
    e :: a -> String
    e x = c x
```
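Put together, and shrunk to two instances, the generated code has this shape (`A1` and `A2` stand in for the ticket's `A1`..`A1000`; the default methods `d` and `e` are what drag the whole group into one recursive binding after specialisation):

```haskell
-- Two class methods get their definitions from defaults that call c,
-- so every instance's dictionary ends up referring to the shared
-- default-method bindings.
class C a where
  c :: a -> String
  d :: a -> String
  d x = c x          -- default method, shared by every instance
  e :: a -> String
  e x = c x          -- likewise

data A1 = A1
data A2 = A2

instance C A1 where c A1 = "A1"
instance C A2 where c A2 = "A2"
```

Scaling this to 1000 instances (as the generator above does) reproduces the blow-up.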
In the ticket 1000 instances are generated, but when I tried that the compiler was using 1.5GB of RAM before I killed it. 100 instances are plenty to show the problem: the time goes in the (Rec pairs) being passed to occAnalBind, which, via its idRuleVars, ties everything into a big knot. I assume that what's going on here is that this is the "normal" c, and there are specialisation rules for uses of c at the various A* types? And to make things worse, I guess J.$dmd is not allowed to be a loop breaker, since the interaction of rules and inlining could then lead to an infinite loop?
So with 100 class instances, we end up with a 500-node SCC being given to reOrderRec/reOrderCycle. This finds a loop breaker, which presumably removes at most one instance's worth of definitions from the SCC. Then it calls stronglyConnCompFromEdgedVerticesR to recompute the SCC, and we go round again. reOrderCycle gets called 200 times, building a new SCC each time round, with the size decreasing roughly linearly from 500 down to zero, so the total work is quadratic in the size of the group.
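The one-breaker-per-round shape can be sketched as a toy simulation. This is not GHC's code; the names `knot` and `naiveBreakers` are invented for illustration, with `Data.Graph` standing in for GHC's SCC machinery:

```haskell
import Data.Graph (SCC (..), stronglyConnComp)
import Data.Maybe (fromMaybe)

type Node = (Int, Int, [Int])  -- (payload, key, out-edges)

-- Every node points at every other, so removing one breaker still
-- leaves a cycle -- a stand-in for the Rec that idRuleVars ties up.
knot :: Int -> [Node]
knot n = [ (i, i, [ j | j <- [0 .. n - 1], j /= i ]) | i <- [0 .. n - 1] ]

-- Pick one loop breaker per round, then recompute the SCCs of the
-- remainder and go round again.  On knot n this needs n - 1 rounds,
-- and each round re-walks the surviving graph: super-linear overall.
naiveBreakers :: [Node] -> [Int]
naiveBreakers nodes = concatMap go (stronglyConnComp nodes)
  where
    outs = [ (k, es) | (_, k, es) <- nodes ]
    go (AcyclicSCC _)         = []
    go (CyclicSCC [])         = []
    go (CyclicSCC (b : rest)) =
      b : naiveBreakers
            [ (v, v, [ t | t <- fromMaybe [] (lookup v outs)
                         , t /= b, t `elem` rest ])
            | v <- rest ]
```

On `knot 10` this chooses nine breakers in nine rounds of SCC recomputation, mirroring the 200 calls to reOrderCycle described above.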
I'm not sure why the space usage is so high; I haven't really looked at
that. I guess we're probably holding on to the old SCCs or something for
some reason. Biographical profiling says that the space is all VOID, but
I'm not convinced that's trustworthy.
```
Mon Mar 23 10:38:26 GMT 2009  simonpj@microsoft.com
  * Avoid quadratic complexity in occurrence analysis (fix Trac #1969)

    The occurrence analyser could go out to lunch in bad cases, because
    of its clever loop-breaking algorithm.  This patch makes it bale
    out in bad cases.  Somewhat ad-hoc: a nicer solution would be
    welcome.  See Note [Complexity of loop breaking] for the details.

    M ./compiler/simplCore/OccurAnal.lhs -22 +71
```
The fix is still not perfect, because it simply bales out in the tricky case. My feeling is that there is probably a good algorithm somewhere for "find the minimal set of nodes whose removal will make the graph acyclic" (the minimum feedback vertex set problem, which is NP-hard in general), but I don't know of it. Furthermore, the problem is made trickier by the presence of RULES etc; see extensive comments in OccurAnal.
Anyway, baling out certainly fixes the bad complexity.
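The bale-out idea can be pictured with the same kind of toy model (invented names again, not the actual OccurAnal code): below a size threshold, do the careful one-breaker-per-round search; above it, give up on cleverness and mark the whole group in a single pass.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)
import Data.Maybe (fromMaybe)

type Node = (Int, Int, [Int])  -- (payload, key, out-edges)

-- Every node points at every other: the "big knot" shape.
knot :: Int -> [Node]
knot n = [ (i, i, [ j | j <- [0 .. n - 1], j /= i ]) | i <- [0 .. n - 1] ]

chooseBreakers :: Int -> [Node] -> [Int]
chooseBreakers limit nodes = concatMap go (stronglyConnComp nodes)
  where
    outs = [ (k, es) | (_, k, es) <- nodes ]
    go (AcyclicSCC _) = []
    go (CyclicSCC vs)
      | length vs > limit = vs   -- bale out: take the whole group, one pass
    go (CyclicSCC (b : rest)) =  -- careful path: one breaker per round
      b : chooseBreakers limit
            [ (v, v, [ t | t <- fromMaybe [] (lookup v outs)
                         , t /= b, t `elem` rest ])
            | v <- rest ]
    go _ = []
```

The bale-out branch sacrifices breaker quality (every node in an oversized group is treated as a breaker candidate) in exchange for touching each edge only once, which is exactly the trade the patch describes.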
I think this could merge into 6.10.
Ian: what about a test-suite example? Maybe just an example with enough instances to trip the timeout?
Yes, it consistently gave me 15 on OS X. I'll relax it to allow 19 too. Do let me know if you see any more of these failures; I'm not sure how loose the bounds should be.