Version 34 (modified by bgamari, 11 days ago)


Compiler performance

This is where we track various efforts to characterize and improve the performance of the compiler itself. If you are interested in the performance of code generated by GHC, see Performance/Runtime.

Relevant tickets

Identify tickets by using "Compile-time performance bug" for the "Type of failure" field.

Here's a list (ticket query: !closed&failure=Compile-time+performance+bug)

Type pile-up

Some programs can produce very deeply nested types whose size grows non-linearly. See "Scrap your type applications" for a way to improve these bad cases.

  • #9198: large performance regression in type checker speed in 7.8
    • Types in Core blowing up quadratically (as seen in -ddump-ds output)

Coercion pile-up

One theme that pops up rather often is the production of Core with long strings of coercions, whose size scales non-linearly with the size of the types in the source program. These cases may or may not share root causes.

  • #8095: TypeFamilies painfully slow
    • Here a recursive type family instance leads to quadratic blow-up of coercions
    • This ticket has a discussion about a way to snip off coercions when not using -dcore-lint.
  • #7428: GHC compile times are seriously non-linear in program size
    • Here a CPS'd State monad leads to a quadratic blow-up in Core size over successive simplifier iterations
  • #5642: Deriving Generic of a big type takes a long time and lots of space

One possible solution (proposed in #8095) is to eliminate coercions from the Core AST during usual compilation, instead only including them when we want to lint the Core.
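Whether a blow-up like those in the tickets above is really quadratic can be checked by measuring Core size or compile time at several input sizes and fitting a slope on a log-log scale. The following is a minimal sketch of that fit, not part of GHC's tooling; the function name `growth_exponent` is hypothetical and the data is synthetic (generated from an exact quadratic), not taken from any ticket.

```python
import math

def growth_exponent(sizes, costs):
    """Least-squares slope of log(cost) against log(size).

    A slope near 1 suggests linear growth, near 2 quadratic growth.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in costs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic data, NOT measurements: cost is exactly 3 * n^2.
sizes = [10, 20, 40, 80]
synthetic_costs = [3 * n * n for n in sizes]

exponent = growth_exponent(sizes, synthetic_costs)
print(f"estimated exponent: {exponent:.2f}")
```

With real measurements the fit will be noisier, so it probably makes sense to use more than a handful of input sizes and to treat the exponent as a rough indication only.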

Deriving instances

Another recurring theme is perceived slowness when compiling code that derives instances. This could be due to a number of reasons:

  1. the implementation of the logic responsible for producing the instance code is inefficient
  2. the instance itself is large but could be expressed more concisely
  3. the instance itself is large but irreducibly so

While it's possible to fix (1) and (2), (3) is inherent.
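To see how deriving cost scales with the size of a type, it may help to generate test modules of varying size and compile each with and without the deriving clause. The sketch below only produces the Haskell source text and does not invoke GHC; the function name `big_data_module`, the module name `Big`, and the type name `T` are all hypothetical choices for illustration.

```python
def big_data_module(n_constructors, n_fields=3, derive=("Eq", "Ord", "Show")):
    """Return Haskell source for a module containing one large data type."""
    if n_constructors < 1:
        raise ValueError("need at least one constructor")
    fields = " ".join(["Int"] * n_fields)
    lines = ["module Big where", "", "data T"]
    for i in range(n_constructors):
        # The first constructor is introduced by '=', the rest by '|'.
        sep = "=" if i == 0 else "|"
        lines.append(f"  {sep} C{i} {fields}".rstrip())
    lines.append(f"  deriving ({', '.join(derive)})")
    return "\n".join(lines) + "\n"

# A tiny instance, printed for inspection.
print(big_data_module(2, n_fields=1))
```

Varying `n_constructors` and the `derive` tuple gives a family of inputs whose compile times can then be compared.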

Uncategorised compiler performance issues

  • #2346: desugaring let-bindings
  • #10228: increase in compiler memory usage, regression from 7.8.4 to 7.10.1
  • #10289: 2.5k static HashSet takes too much memory to compile
    • Memory usage improved significantly with #10370, but overall wall-clock time got worse!
  • #7450: Regression in optimisation time of functions with many patterns (6.12 to 7.4)?
  • #10800: vector-0.11 compile time increased substantially with 7.10.1
    • Regression in vector testsuite perhaps due to change in inlinings
  • #13639: Skylighting package compilation is glacial

nofib results

tests/perf/compiler results

7.6 vs 7.8

  • A bit difficult to decipher, since many of the stats and surrounding numbers were completely rewritten during some testsuite API overhauls.
  • The results are mixed: some metrics, such as peak_megabytes_allocated, were bumped up a lot, but many tests also had bytes_allocated go down.

7.8 vs 7.10

  • Things mostly got better according to these, not worse!
  • Many of them had drops in bytes_allocated, for example, T4801.
  • The typical improvement is in the range of 1-3%.
  • But one got much worse: T5837's bytes_allocated jumped from 45520936 to 115905208, about 2.5x worse!
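The T5837 figure can be checked with a couple of lines of arithmetic. This is a small sketch using only the two numbers quoted above; the helper name `relative_change` is hypothetical.

```python
def relative_change(before, after):
    """Return (ratio, percent increase) between two metric values."""
    ratio = after / before
    return ratio, (ratio - 1) * 100

# T5837 bytes_allocated, before and after, as quoted in the text.
ratio, percent = relative_change(45520936, 115905208)
print(f"{ratio:.2f}x, +{percent:.0f}%")
```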

7.10 vs HEAD

  • Most results actually got better, not worse!
  • Silent superclasses made the numbers on HEAD drop in several places, some by noticeably more than 2x
    • max_bytes_used increased in some cases, but not much, probably GC wibbles.
  • No major regressions, mostly wibbles.

Compile/build times

(NB: Sporadically updated)

As of April 22nd, 2016:

  • GHC HEAD: 14m9s (via 7.8.3) (because of Joachim's call-arity improvements)
  • GHC 7.10: 15m43s (via 7.8.3)
  • GHC 7.8: 12m54s (via 7.8.3)
  • GHC 7.6: 8m19s (via 7.4.1)

Random note: GHC 7.10's build system actually disabled DPH (half a dozen more packages and probably a hundred extra modules), yet things *still* got slower over time!
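To put the build times above on a common scale, they can be converted to seconds and expressed relative to the 7.6 build. The sketch below uses only the durations listed above; the helper name `to_seconds` is hypothetical. Note that the 7.6 build was bootstrapped with 7.4.1 while the others used 7.8.3, so the ratios are only a rough comparison.

```python
import re

def to_seconds(duration):
    """Convert a string like '14m9s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", duration)
    if match is None:
        raise ValueError(f"unrecognised duration: {duration!r}")
    return int(match.group(1)) * 60 + int(match.group(2))

# Build times as listed above (April 22nd, 2016).
build_times = {"HEAD": "14m9s", "7.10": "15m43s", "7.8": "12m54s", "7.6": "8m19s"}

baseline = to_seconds(build_times["7.6"])
slowdowns = {v: to_seconds(t) / baseline for v, t in build_times.items()}

for version, factor in slowdowns.items():
    seconds = to_seconds(build_times[version])
    print(f"GHC {version}: {seconds}s, {factor:.2f}x the 7.6 build")
```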

Interesting third-party library numbers

  • Compile time of an example program (fluid-tree) from the fltkhs library increased from about 15 seconds to more than a minute (original message).
  • GHC takes significantly more memory compiling the xmlhtml library with -j4 than -j1 (1GB vs 150MB). See #9370.
  • The Language.Haskell.Exts.Annotated.Syntax module of haskell-src-exts takes many tens of seconds to compile. However, this may not be surprising: it consists of roughly 70 data definitions, some with many constructors, with deriving (Eq,Ord,Show,Typeable,Data,Foldable,Traversable) on most of them, and it also defines Functor instances.
  • vector-algorithms may be a nice test case; it reportedly got slower both to compile and to run in recent GHC releases.

Relevant changes

GHC 7.10 to GHC 8.0

GHC 8.0 to GHC 8.2