Opened 3 years ago

Closed 3 years ago

#5284 closed bug (fixed)

Simplifier performance regression (or infinite loop)

Reported by: simonmar Owned by: igloo
Priority: high Milestone: 7.4.1
Component: Compiler Version: 7.0.3
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Difficulty:
Test Case: Blocked By:
Blocking: Related Tickets:

Description

Compiling simplCore/should_compile/T3016.hs with -O does not finish with 7.1.20110627 (I waited over 10 minutes), but it completes in 30s with 7.0.3.

Without -O it compiles fine, in about the same time as 7.0.3.

Marking as "highest" because this is a regression.

Change History (8)

comment:1 Changed 3 years ago by simonpj

That's odd. It seems ok now. Can you try again?

bash$ wc T3016.hs 
   584   3486 174956 T3016.hs

bash$ time ~/builds/validate-HEAD/inplace/bin/ghc-stage2 -c -O T3016.hs  -fforce-recomp +RTS -s 
  13,862,317,264 bytes allocated in the heap
   8,584,884,064 bytes copied during GC
     123,304,256 bytes maximum residency (54 sample(s))
       2,311,912 bytes maximum slop
             325 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     26462 colls,     0 par   12.39s   12.40s     0.0005s    0.0113s
  Gen  1        54 colls,     0 par    9.54s    9.55s     0.1769s    0.6452s

  Parallel GC work balance: -nan (0 / 0, ideal 1)

                        MUT time (elapsed)       GC time  (elapsed)
  Task  0 (worker) :    0.00s    ( 41.86s)       0.00s    (  0.00s)
  Task  1 (worker) :    0.00s    ( 41.86s)       0.00s    (  0.00s)
  Task  2 (bound)  :   16.94s    ( 19.91s)      21.92s    ( 21.95s)

  SPARKS: 0 (0 converted, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   16.94s  ( 19.90s elapsed)
  GC      time   21.93s  ( 21.95s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   38.87s  ( 41.86s elapsed)

  Alloc rate    818,245,673 bytes per MUT second

  Productivity  43.6% of total user, 40.5% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

real	0m41.927s
user	0m40.970s
sys	0m0.920s

It's not great, though. The module is about 600 lines of code with two very big integer constants on each line. These are all re-expressed as products of reasonable-sized integers. So just one line of the file expands to the code below.

This is a bit stupid. We should just have a function

mkInteger :: [Int] -> Integer
mkInteger xs = go 1 xs
  where
    go :: Integer -> [Int] -> Integer
    go acc [] = acc
    go acc (x:xs) = go (acc*maxInt + x) xs

and then convert big Integer literals into mkInteger [p,q,r] where the [p,q,r] are the coefficients. That would generate far less code.

Small literals should use the old route, so that constant folding has a chance of happening.

Would anyone like to do this? The relevant code is in MkCore.mkIntegerExpr.

a_s7Y0 :: T3016.P
[LclId,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False, Guidance=NEVER}]
a_s7Y0 =
  case GHC.Integer.plusInteger
         (GHC.Integer.Type.S# 790894860564361520)
         (GHC.Integer.timesInteger
            (GHC.Integer.plusInteger
               (GHC.Integer.Type.S# 5694277513942979484)
               (GHC.Integer.timesInteger
                  (GHC.Integer.plusInteger
                     (GHC.Integer.Type.S# 8695760837991935493)
                     (GHC.Integer.timesInteger
                        (GHC.Integer.plusInteger
                           (GHC.Integer.Type.S# 3111745589992438195)
                           (GHC.Integer.timesInteger
                              (GHC.Integer.plusInteger
                                 (GHC.Integer.Type.S# 5932496986262705997)
                                 (GHC.Integer.timesInteger
                                    (GHC.Integer.plusInteger
                                       (GHC.Integer.Type.S# 2467237890019628039)
                                       (GHC.Integer.timesInteger
                                          (GHC.Integer.plusInteger
                                             (GHC.Integer.Type.S# 728563175101493569)
                                             (GHC.Integer.timesInteger
                                                (GHC.Integer.plusInteger
                                                   (GHC.Integer.Type.S# 6858788324311301629)
                                                   (GHC.Integer.timesInteger
                                                      (GHC.Integer.Type.S# 2273139894617598453)
                                                      (GHC.Integer.Type.S# 9223372036854775807)))
                                                (GHC.Integer.Type.S# 9223372036854775807)))
                                          (GHC.Integer.Type.S# 9223372036854775807)))
                                    (GHC.Integer.Type.S# 9223372036854775807)))
                              (GHC.Integer.Type.S# 9223372036854775807)))
                        (GHC.Integer.Type.S# 9223372036854775807)))
                  (GHC.Integer.Type.S# 9223372036854775807)))
            (GHC.Integer.Type.S# 9223372036854775807))
  of nt_s80m [Dmd=Just L] { __DEFAULT ->
  case GHC.Integer.plusInteger
         (GHC.Integer.Type.S# 1951866921918820082)
         (GHC.Integer.timesInteger
            (GHC.Integer.plusInteger
               (GHC.Integer.Type.S# 455876676931171453)
               (GHC.Integer.timesInteger
                  (GHC.Integer.plusInteger
                     (GHC.Integer.Type.S# 1769193696036972708)
                     (GHC.Integer.timesInteger
                        (GHC.Integer.plusInteger
                           (GHC.Integer.Type.S# 2753606951986561079)
                           (GHC.Integer.timesInteger
                              (GHC.Integer.plusInteger
                                 (GHC.Integer.Type.S# 2915216215365162184)
                                 (GHC.Integer.timesInteger
                                    (GHC.Integer.plusInteger
                                       (GHC.Integer.Type.S# 1817063239677662611)
                                       (GHC.Integer.timesInteger
                                          (GHC.Integer.plusInteger
                                             (GHC.Integer.Type.S# 5655233183732864)
                                             (GHC.Integer.timesInteger
                                                (GHC.Integer.plusInteger
                                                   (GHC.Integer.Type.S# 8781166386102664098)
                                                   (GHC.Integer.timesInteger
                                                      (GHC.Integer.plusInteger
                                                         (GHC.Integer.Type.S# 5376807536228193618)
                                                         (GHC.Integer.timesInteger
                                                            (GHC.Integer.Type.S# 8)
                                                            (GHC.Integer.Type.S#
                                                               9223372036854775807)))
                                                      (GHC.Integer.Type.S# 9223372036854775807)))
                                                (GHC.Integer.Type.S# 9223372036854775807)))
                                          (GHC.Integer.Type.S# 9223372036854775807)))
                                    (GHC.Integer.Type.S# 9223372036854775807)))
                              (GHC.Integer.Type.S# 9223372036854775807)))
                        (GHC.Integer.Type.S# 9223372036854775807)))
                  (GHC.Integer.Type.S# 9223372036854775807)))
            (GHC.Integer.Type.S# 9223372036854775807))
  of nt_s80o [Dmd=Just L] { __DEFAULT ->
  T3016.NZ
    (nt_s80m
     `cast` (Sym (T3016.NTCo:F) :: GHC.Integer.Type.Integer ~ T3016.F))
    (nt_s80o
     `cast` (Sym (T3016.NTCo:F) :: GHC.Integer.Type.Integer ~ T3016.F))
  }
  }

comment:2 Changed 3 years ago by simonmar

  • Owner set to igloo

comment:3 Changed 3 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1
  • Priority changed from highest to high

The code explosion part is fixed, and merged to 7.2.

comment:4 Changed 3 years ago by simonmar

Commit that fixed the bug? Also, I think we should make a separate ticket for better compilation of large Integer literals.

comment:5 Changed 3 years ago by igloo

In 7.2:

commit 9506f95ac32c91a9fa6ce659a4474d6f85e55268
Author: Ian Lynagh <igloo@earth.li>
Date:   Sat Jul 23 17:56:40 2011 +0100

    Don't inline most integer operations
    
    We get lots of code, and the simplifier generally can't do much with
    it. We'll instead use builtin rules to do constant folding where
    possible.

It's changeset:94df30b07c20c0031c1a8cc57ee3479423b963ac in the HEAD.

comment:6 Changed 3 years ago by simonpj

There's another ticket about literals: #5152

comment:7 Changed 3 years ago by simonmar

Simon and I had a chat about the representation of Integer literals today. Here's our proposed plan, which Ian has agreed to look into implementing.

  • small Integer literals will continue to be represented by explicit S# constructors in Core
  • large Integer literals will turn into mkInteger <len> "<bytes>"# where <bytes> is the binary representation of the big int (choosing endianness etc. appropriately to be as close as possible to the GMP representation), and <len> is the number of bytes (multiplied by -1 if the literal is negative, as in GMP). The implementation of mkInteger in integer-gmp creates a J#, hopefully with little fuss - just allocating a ByteArray# and copying the bytes. Perhaps some byte-swapping within each word will be necessary. The point is that this is a very compact representation of Integer literals, that is also amenable to constant folding.
  • To fix #5152, we have a built-in rule for fromInteger at Word64, recognising a call to mkInteger in its argument (and possibly the other integral types, replacing the current typechecker magic?)
  • We could also add constant folding for large integers by recognising mkInteger in the other builtin Integer rules.

comment:8 Changed 3 years ago by igloo

  • Resolution set to fixed
  • Status changed from new to closed

The plan evolved a bit, but this is now essentially done, including constant folding for larger Integers.

Note: See TracTickets for help on using tickets.