Opened 4 years ago

Closed 4 years ago

#9583 closed bug (fixed)

Simplifier ticks exhausted while compiling Cabal HEAD

Reported by: ezyang Owned by:
Priority: highest Milestone:
Component: Compiler Version: 7.9
Keywords: Cc: dreixel
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time crash Test Case:
Blocked By: Blocking:
Related Tickets: #9630 Differential Rev(s):
Wiki Page:

Description

Steps to reproduce:

  1. Clone Cabal HEAD
  2. cabal configure --with-ghc=/path/to/ghc-head
  3. dist/setup/setup build

Actual results:

[ 8 of 78] Compiling Distribution.Text ( Distribution/Text.hs, dist/build/Distribution/Text.o )
[ 9 of 78] Compiling Distribution.Version ( Distribution/Version.hs, dist/build/Distribution/Version.o )
ghc-stage2: panic! (the 'impossible' happened)
  (GHC version 7.9.20140911 for x86_64-unknown-linux):
        Simplifier ticks exhausted
    When trying UnfoldingDone GHC.Word.$fNumWord8_$c+
    To increase the limit, use -fsimpl-tick-factor=N (default 100)
    If you need to do this, let GHC HQ know, and what factor you needed
    To see detailed counts use -ddump-simpl-stats
    Total ticks: 813720

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

I know that it still fails with tick factor 4096; I haven't found a tick amount which makes it succeed.

Change History (19)

comment:1 Changed 4 years ago by simonpj

How do I "clone Cabal Head"? Perhaps

git clone http://git.haskell.org/packages/Cabal.git

Just want to be sure we are talking about the same thing

comment:2 Changed 4 years ago by ezyang

Sorry, use the GitHub repo:

git clone http://github.com/haskell/cabal.git

It also fails at tick factor 8192, and ran out of memory for me on the next power of two. Probably an infinite loop.

UPDATE: I just checked, both should work.

Last edited 4 years ago by ezyang (previous) (diff)

comment:3 in reply to:  2 Changed 4 years ago by hvr

Replying to ezyang:

UPDATE: I just checked, both should work.

that's because github's master branch is automatically mirrored to our git.haskell.org copy of Cabal

comment:4 Changed 4 years ago by slyfox

I wonder if #9565 is of the same cause (smaller source file, but nastier).

comment:5 Changed 4 years ago by ezyang

I've reduced the test case example. It appears to be related to the Generic binary implementation:

{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE DeriveGeneric #-}
{-# OPTIONS_GHC -O #-}

module Distribution.Version  where

import Data.Binary      ( Binary(..) )
import Data.Data        ( Data )
import Data.Typeable    ( Typeable )  
import GHC.Generics     ( Generic )

data T = A
       | B
       | C T
       | D T T
       | E T T
  deriving (Data, Generic, Typeable)

instance Binary T
Last edited 4 years ago by ezyang (previous) (diff)

comment:6 Changed 4 years ago by JohnWiegley

Simon and I had a look at this, and we found a few more data points:

  1. Manually implementing get and put for the instance, even if just to copy the default definitions, causes the test case above to pass.
  1. The following data type also reproduces the problem, removing even one of the T arguments to A passes:
data T = A T T T T T T T T T T T T | B T
  deriving (Data, Typeable, Generic)
  1. If you implement get for the instance, but not put, the error still occurs; but the converse works.
  1. If you set put = gdmput, and then define it in its own module, marking it INLINE works, but INLINEABLE fails:
gdmput :: (Generic t, GBinary (Rep t)) => t -> Put
gdmput = gput . from
{-# INLINABLE gdmput #-}
  1. Displaying inlining information with -ddump-inlinings -ddump-rule-firings shows that after the first few unfoldings, the following repeats continually up until the failure. Increasing the tick cause simply causes more of this occurrence to display:
Rule fired: Class op gput
Rule fired: Class op gput
Rule fired: Class op gput
Rule fired: Class op gput
Inlining done: Data.Binary.Generic.unTagged
Inlining done: Data.Binary.Generic.unTagged1
Inlining done: GHC.Base.id
Inlining done: GHC.Word.$fOrdWord64_$c<=
Inlining done: GHC.Word.$fNumWord64_$c-
Inlining done: Data.Binary.Generic.$fSumSizeM1_$csumSize
Inlining done: Data.Binary.Generic.$fSumSizeM1_$csumSize
Rule fired: plusWord#
Inlining done: GHC.Word.$fBitsWord64_$cfromInteger
Rule fired: integerToWord
Rule fired: minusWord#
Rule fired: leWord#
Rule fired: tagToEnum#
Inlining done: GHC.Word.$fBitsWord8_$cfromInteger
Rule fired: integerToWord
Rule fired: narrow8Word#
Inlining done: Data.Binary.Generic.$fSumSizeM1_$csumSize
Inlining done: Data.Binary.Generic.$fSumSizeM1_$csumSize
Rule fired: plusWord#
Inlining done: GHC.Base.id
Rule fired: narrow8Word#
Inlining done: GHC.Word.$fBitsWord8_$cshiftR
Rule fired: >=#
Rule fired: tagToEnum#
Rule fired: uncheckedShiftRL#
Rule fired: Class op putSum
Inlining done: Data.Binary.Put.$fApplicativePutM_$c*>
Inlining done: Data.Binary.Put.$fApplicativePutM2
Rule fired: Class op put
Inlining done: Data.Binary.Put.putWord8
Inlining done: GHC.Base.$
Inlining done: GHC.Base.returnIO
Inlining done: GHC.Base.returnIO1
Inlining done: GHC.Base.bindIO
Inlining done: GHC.Base.bindIO1
Inlining done: GHC.Base.flip
Inlining done: Foreign.Storable.$fStorableWord8_$cpoke
Inlining done: Foreign.Storable.$fStorableWord19
Rule fired: Class op gput
Inlining done: GHC.Word.$fNumWord8_$c+
Rule fired: plusWord#
Rule fired: narrow8Word#
Inlining done: GHC.Word.$fNumWord8_$c-
Rule fired: minusWord#
Rule fired: narrow8Word#
Rule fired: Class op putSum
Inlining done: Data.Binary.Put.$fApplicativePutM_$c*>
Inlining done: Data.Binary.Put.$fApplicativePutM2
Rule fired: Class op put
Inlining done: Data.Binary.Put.putWord8
Inlining done: GHC.Base.$
Inlining done: GHC.Base.returnIO
Inlining done: GHC.Base.returnIO1
Inlining done: GHC.Base.bindIO
Inlining done: GHC.Base.bindIO1
Inlining done: GHC.Base.flip
Inlining done: Foreign.Storable.$fStorableWord8_$cpoke
Inlining done: Foreign.Storable.$fStorableWord19
Last edited 4 years ago by JohnWiegley (previous) (diff)

comment:7 Changed 4 years ago by ezyang

I'm going to try and bisect this problem.

comment:8 Changed 4 years ago by ezyang

Bisection finished, the bad commit is this one:

b9e49d3e9580e13d89efd1f779cb76f610e0d6e0 is the first bad commit                       [4/1912]
commit b9e49d3e9580e13d89efd1f779cb76f610e0d6e0
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Tue May 13 13:10:26 2014 +0100

    Add -fspecialise-aggressively
    
    This flag specialises any imported overloaded function that has an
    unfolding, whether or not it was marked INLINEABLE.
    
    We get a lot of orphan SPEC rules as a result, but that doesn't matter
    provided we don't treat orphan auto-generated rules as causing the module
    itself to be an orphan module.  See Note [Orphans and auto-generated rules]
    in MkIface.

comment:9 Changed 4 years ago by simonpj

I thought this was going to be very complicated but it turned out to be very simple! The occurrence analyser does something called "glomming" if the application of imported RULES means that something that didn't look recursive becomes recursive. See Note [Glomming] in OccurAnal. Under these circumstances we group all the top-level bindings into a single massive Rec.

But, crucially, I failed to repeat the occurrence analysis on this glommed set of bindings. That means that we weren't establishing the right loop breakers (indeed there were no loop breakers whatsoever), and that led immediately to the loop. The only surprising this is that it didn't happen before.

Patch coming

Simon

comment:10 in reply to:  9 Changed 4 years ago by hvr

Replying to simonpj:

Patch coming

any ETA on that patch? It's blocking a few things right now... :-)

comment:11 Changed 4 years ago by Simon Peyton Jones <simonpj@…>

In 5fa6e75960b3dddbc72c35eb3fc0f2759215dfbb/ghc:

Ensure that loop breakers are computed when glomming

This patch fixes Trac #9583, a loop in the simplifier.

I thought this was going to be very complicated but it turned out to
be very simple!  The occurrence analyser does something called
"glomming" if the application of imported RULES means that something
that didn't look recursive becomes recursive.  See `Note [Glomming]`
in `OccurAnal`.  Under these circumstances we group all the top-level
bindings into a single massive `Rec`.

But, crucially, I failed to repeat the occurrence analysis on this
glommed set of bindings.  That means that we weren't establishing the
right loop breakers (indeed there were no loop breakers whatsoever),
and that led immediately to the loop. The only surprising this is that
it didn't happen before.

comment:12 Changed 4 years ago by simonpj

I've just pushed it. I'm getting a failure on perf/compiler/T783, but as usual I'm uncertain about whether the patch is the culprit or not. I suspect not, so since it's a blocker I'm pushing it anyway.

Simon

comment:13 Changed 4 years ago by Simon Peyton Jones <simonpj@…>

In 01906c7399301e4f69959ecbd3b0d8bee5d5ef70/ghc:

Test Trac #9565 and #9583

comment:14 Changed 4 years ago by simonpj

Cc: dreixel added

Perhaps someone can confirm that Cabal is ok, and then close.

Pedro: the "deriving Binary" stuff still generates a gargantuan quantity of code!

Simon

comment:15 in reply to:  12 Changed 4 years ago by hvr

Replying to simonpj:

I've just pushed it. I'm getting a failure on perf/compiler/T783, but as usual I'm uncertain about whether the patch is the culprit or not. I suspect not, so since it's a blocker I'm pushing it anyway.

fwiw, Phab:harbormaster/build/952 built just fine (but that's for Linux/x86_64 thresholds)

comment:16 in reply to:  14 Changed 4 years ago by hvr

Replying to simonpj:

Perhaps someone can confirm that Cabal is ok, and then close.

Pedro: the "deriving Binary" stuff still generates a gargantuan quantity of code!

Compiling Cabal within the GHC tree is not really ok (see comments in Phab:D183). It does compile, but it takes *several* minutes (somewhere beyond 7 minutes on the buildbot) as well as massive amount of memory (over 50% of the 4GiB mem available in the buildbot, causing it to fail w/ an out-of-mem exception since we build with CPUS=4) when trying to produce the libraries/Cabal/Cabal/dist-install/build/Language/Haskell/Extension.o module with ghc-stage1.

Last edited 4 years ago by hvr (previous) (diff)

comment:17 Changed 4 years ago by simonpj

Well this ticket was about a bug in GHC that meant it really didn't compile *at all*. Now, from what you say, it does. But there may still be a problem with the amount of code that deriving Binary generates (hence copying Pedro), perhaps due to nested tuples (which might be helped by the new SOP approach) or perhaps due to over-enthusiastic INLINE pragmas.

In any case, I don't think there's a bug in GHC any more. But I'll leave this ticket open in the hope that we may make progress on the deriving Binary question.

Simon

comment:18 Changed 4 years ago by hvr

I've created a new ticket for the performance regression over at #9630 to separate concerns a bit

comment:19 Changed 4 years ago by simonpj

Resolution: fixed
Status: newclosed

In which case we can close this one.

Note: See TracTickets for help on using tickets.