Compiler performance

This is where we track various efforts to characterize and improve the performance of the compiler itself. If you are interested in the performance of code generated by GHC, see Performance/Runtime.

Relevant tickets

Identify tickets by using "Compile time performance bug" for the "Type of failure" field.

Open Tickets:

#15176
Superclass `Monad m =>` makes program run 100 times slower
#15090
Do more coercion optimisation on the fly
#15019
Fix performance regressions from #14737
#14988
Memory strain while compiling HLint
#14974
2-fold memory usage regression GHC 8.2.2 -> GHC 8.4.1 compiling `mmark` package
#14944
Compile speed regression
#14923
Recompilation avoidance fails after a LANGUAGE change
#14854
The size of FastString table is suboptimal for large codebases
#14766
Hole-y partial type signatures greatly slow down compile times
#14741
High-memory usage during compilation using Template Haskell
#14738
Investigate performance of CoreTidy
#14594
2 modules / 2500LOC takes nearly 3 minutes to build
#14281
Minor regressions from removal of non-linear behavior from simplifier
#14165
Investigate regressions from simplifier refactor
#14037
Fix fusion for GHC's utility functions
#14031
Linker paths carry substantial N*M overhead when many libraries are used
#13993
Certain inter-module specializations run out of simplifier ticks
#13904
LLVM does not need to trash caller-saved registers.
#13745
Investigate compile-time regressions in regex-tdfa-1.2.2
#13744
Compile-time regression in 8.2 when compiling bloodhound's test suite
#13586
ghc --make seems to leak memory
#13565
Compiler allocations on sched in nofib regressed by 10% between 091333313 and 1883afb2
#13564
Why does memory usage increase so much during CoreTidy?
#13535
vector test suite uses excessive memory on GHC 8.2
#13426
compile-time memory-usage regression for DynFlags between GHC 8.0 and GHC 8.2
#13386
Poor compiler performance with type families
#13353
foldr/nil rule not applied consistently
#13282
Introduce fast path through simplifier for static bindings
#13279
Check known-key lists
#13270
Make Core Lint faster
#13253
Exponential compilation time with RWST & ReaderT stack with `-O2`
#13226
Compiler allocation regressions from top-level string literal patch
#13092
family instance consistency checks are too pessimistic
#13063
Program uses 8GB of memory
#13048
Splitter is O(n^2)
#12896
Consider using compact regions in GHC itself to reduce GC overhead
#12860
GeneralizedNewtypeDeriving + MultiParamTypeClasses sends typechecker into an infinite loop
#12847
ghci -fobject-code -O2 doesn't do the same optimisations as ghc --make -O2
#12765
Don't optimize coercions with -O0
#12506
Compile time regression in GHC 8.
#12412
SIMD things introduce a metric ton of known key things
#12274
GHC panic: simplifier ticks exhausted
#12032
Performance regression with large numbers of equation-style decls
#12028
Large let bindings are 6x slower (since 6.12.x to 7.10.x)
#11822
Pattern match checker exceeded (2000000) iterations
#11735
Optimize coercionKind
#11545
Strictness signature blowup
#11528
Representation of value set abstractions as trees causes performance issues
#11380
Compiling a 10.000 line file exhausts memory
#11323
powerpc64: recomp015 fails with redundant linking
#11263
"Simplifier ticks exhausted" that resolves with fsimpl-tick-factor=200
#11260
Re-compilation driver/recomp11 test fails
#11196
TypeInType performance regressions
#11151
T3064 regresses with wildcard refactor
#10980
Deriving Read instance from datatype with N fields leads to N^2 code size growth
#10844
CallStack should not be inlined
#10818
GHC 7.10.2 takes much longer to compile some packages
#10584
Installation of SFML failed
#10228
Increased memory usage with GHC 7.10.1
#9979
Performance regression GHC 7.8.4 to GHC HEAD
#9780
dep_orphs in Dependencies redundantly records type family orphans
#9675
Unreasonable memory usage on large data structures
#9669
Long compile time/high memory usage for modules with many deriving clauses
#9557
Deriving instances is slow
#9370
unfolding info as seen when building a module depends on flags in a previously-compiled module
#9221
(super!) linear slowdown of parallel builds on 40 core machine
#9198
large performance regression in type checker speed in 7.8
#9020
Massive blowup of code size on trivial program
#8774
Transitivity of Auto-Specialization
#8731
long compilation time for module with large data type and partial record selectors
#8523
blowup in space/time for type checking and object size for high arity tuples
#8211
ghc -c recompiles TH every time while --make doesn't
#8173
GHC uses nub
#8147
Exponential behavior in instance resolution on fixpoint-of-sum
#8144
Interface hashes include time stamp of dependent files (UsageFile mtime)
#8095
TypeFamilies painfully slow
#7803
Superclass methods are left unspecialized
#7450
Regression in optimisation time of functions with many patterns (6.12 to 7.4)?
#7428
GHC compile times are seriously non-linear in program size
#7258
Compiling DynFlags is jolly slow
#6047
GHC retains unnecessary binding
#5642
Deriving Generic of a big type takes a long time and lots of space
#5224
Improve consistency checking for family instances
#3831
SpecConstr should exploit cases where there is exactly one call pattern
#2988
Improve float-in
#2346
Compilation of large source files requires a lot of RAM
#1290
ghc runs preprocessor too much

Closed Tickets:

#15164
Slowdown in ghc compile times from GHC 8.0.2 to GHC 8.2.1 when doing Called arity analysis
#14969
Underconstrained typed holes are non-performant
#14928
TH eats 50 GB memory when creating ADT with multiple constructors
#14737
Improve performance of Simplify.simplCast
#14723
GHC 8.4.1-alpha loops infinitely when typechecking
#14697
Redundant computation in fingerprintDynFlags when compiling many modules
#14693
Computing imp_finst can take up significant amount of time
#14683
Slow compile times for Happy-generated source
#14667
Compiling a function with a lot of alternatives bottlenecks on insertIntHeap
#14657
Quadratic constructor tag allocation
#14450
GHCi spins forever
#14435
GHC 8.2.1 regression: -ddump-tc-trace hangs forever
#14378
Unreasonably high memory use when compiling with profiling and -O2/-O2
#14254
The Binary instance for TypeRep smells a bit expensive
#14161
Performance Problems on AST Dump
#13789
Look into haddock performance regressions due to desugaring on -fno-code
#13719
checkFamInstConsistency dominates compile time
#13701
GHCi 2x slower without -keep-tmp-files
#13659
Bug report: "AThing evaluated unexpectedly tcTyVar a_alF"
#13639
Skylighting package compilation is glacial
#13395
3x slowdown on GHC HEAD with file containing lots of overloaded string literals
#13379
Space leak / quadratic behavior when inlining
#13344
Core string literal patch regresses compiler performance considerably
#13188
COMPLETE pragma causes compilation to hang forever under certain scenarios
#13081
Code size explosion with inlined instances for fixed point of functor
#13059
High memory usage during compilation
#13056
Deriving Foldable causes GHC to take a long time (GHC 8.0 ONLY)
#12878
Use gold linker by default if available on ELF systems
#12790
GHC 8.0.1 uses copious amounts of RAM and time when trying to compile lambdabot-haskell-plugins
#12754
Adding an explicit export list halves compilation time.
#12567
`ghc --make` recompiles unchanged files when using `-fplugin` OPTIONS
#12545
Compilation time/space regression in GHC 8.0/8.1 (search in type-level lists and -O)
#12425
With -O1 and above causes ghc to use all available memory before being killed by OOM killer
#12367
Commit adding instances to GHC.Generics regression compiler performance
#12357
Increasing maximum constraint tuple size significantly blows up compiler allocations
#12234
'deriving Eq' on recursive datatype makes ghc eat a lot of CPU and RAM
#12227
regression: out of memory with -O2 -ddump-hi on a complex INLINE function
#12191
7% allocation regression in Haddock performance tests
#12150
Compile time performance degradation on code that uses undefined/error with CallStacks
#11991
Generics deriving is quadratic
#11800
T9872d bytes allocated has regressed terribly on 32-bit Linux
#11598
Cache coercion kinds and roles
#11597
Optimize cmpTypeX
#11518
Test TcCoercibleFail hangs with substitution sanity checks enabled
#11443
SPECIALIZE pragma does not work + compilation times regression in GHC 8.0-rc1
#11415
pandoc-types fails to build on 4 GB machine
#11407
-XTypeInType uses up all memory when used in data family instance
#11379
Solver hits iteration limit in code without recursive constraints
#11375
Type aliases twice as slow to compile as closed type families.
#11303
Pattern matching against sets of strings sharing a prefix blows up pattern checker
#11285
Split objects makes static linking really slow
#11163
New exhaustiveness checker breaks T5642
#11162
T783 regresses severely in allocations with new pattern match checker
#11161
New exhaustiveness checker breaks concurrent/prog001
#11160
New exhaustiveness checker breaks ghcirun004
#11095
-O0 -g slows GHC down on list literals (compared to -O0 without -g)
#11074
invalid fixup in runtime linker
#11030
D757 (emit Typeable at type definition site) regresses T3294 max_bytes_used by factor of two
#10858
Smaller generated Ord instances
#10852
ghc 7.8.4 on arm - panic: Simplifier ticks exhausted
#10837
Constant-time indexing of closed type family axioms
#10800
vector-0.11 compile time increased substantially with 7.10.1
#10711
Defining mapM_ in terms of traverse_ causes substantial blow-up in ByteCodeAsm
#10693
Profile ghc -j with an eye for performance issues
#10528
compile time performance regression with OverloadedStrings and Text
#10491
Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround
#10370
Compile time regression in OpenGLRaw
#10293
CallArity taking 20% of compile time
#10289
compiling huge HashSet hogs memory
#9961
compile-time performance regression compiling genprimcode
#9960
Performance problem with TrieMap
#9771
Excessive memory usage compiling T3064
#9630
compile-time performance regression (probably due to Generics)
#9400
poor performance when compiling modules with many Text literals at -O1
#9243
Recompilation avoidance doesn't work for -fno-code/-fwrite-interface
#9233
Compiler performance regression
#9229
Compiler memory use regression
#9077
Forcing the type to be IO {} instead of IO() causes a "panic! The impossible has happened" output.
#9073
small SPECIALIZE INLINE program taking gigabytes of memory to compile
#8962
compile hang and memory blowup when using profiling and optimization
#8852
7.8.1 uses a lot of memory when compiling attoparsec programs using <|>
#8691
Investigate recent 32bit compiler performance regressions
#8654
Exponential-long compilation of code with Implicit params
#8229
Linking in Windows is slow
#8174
GHC should not load packages for TH if they are not used
#7960
Compiling profiling CCS registration .c file takes far too long
#7847
Maintain per-generation lists of weak pointers
#7846
GHC 7.7 cannot link primitives
#7702
Memory Leak in CoreM (CoreWriter)
#7637
split-objs not supported for ARM
#7414
plugins always trigger recompilation
#7286
GHC doesn't optimise away primitive identity conversions
#7231
GHCi erroneously unloads modules after a failed :reload
#7198
New codegen more than doubles compile time of T3294
#7068
Extensive Memory usage (regression)
#6104
Regression: space leak in HEAD vs. 7.4
#5981
quadratic slowdown with very long module names
#5970
Type checker hangs
#5905
ghc with incorrect arguments deletes source file
#5652
T3016 takes long time to compile with LLVM
#5631
Compilation slowdown from 7.0.x to 7.2.x
#5522
mc03 -O -fliberate-case -fspec-constr runs out of memory
#5352
Very slow (nonterminating?) compilation if libraries compiled with -fexpose-all-unfoldings
#5321
Very slow constraint solving for type families
#5284
Simplifier performance regression (or infinite loop)
#5271
Compilation speed regression
#5156
New codegen: CmmStackLayout igraph memory explosion
#5102
ghc struggles to compile a large case statement
#5030
Slow type checking of type-level computation heavy code.
#4856
Performance regression in the type checker regression for GADTs and type families
#4849
object code size fairly large for ghc-7.0.1 with optimization
#4838
LLVM mangler takes too long at runtime
#4528
stand-alone deriving sometimes fails for GADTs
#4435
T3016 failed with timeout (hpc and optasm)
#4434
barton-mangler-bug failed with timeout (multiple ways)
#4421
Compilation performance regression
#4367
Compiler space regression in 7.0.1 RC 1
#4324
Template Haskell: Splicing Infinite Syntax Tree doesn't stop
#4235
deriving Enum fails for data instances
#4029
ghci leaks memory when loading a file
#3972
ghc 6.12.1 and 6.13.20090922 consume a lot more memory than 6.10.4 when compiling language-python package
#3897
reading a large String as Double takes too long
#3829
GHC leaks memory when compiling many files
#3796
GHC 6.12 dependency checking many times slower than 6.10
#3664
Ghc eats tremendous heaps of RAM in -prof build (highlighting-kate)
#3629
Code compiled WITHOUT profiling many times slower than compiled WITH profiling on
#3294
Large compilation time/memory consumption
#3064
Very long compile times with type functions
#2859
Reduce coercion terms to normal form
#2762
Excessive heap usage
#2680
Type-checking performance regression
#2609
Compiling with -O2 is 7x slower than -O
#2438
memory performance problem when compiling lots of derived instances in a single file
#2328
Compiling DoCon with 6.8.3 has 3x slow-down compared with 6.8.2
#2159
Use a more efficient representation than [DynFlag]
#2089
reading the package db is slow
#2002
problems with very large (list) literals
#1969
enormous compile times
#1875
Compiling with -O is 30 times slower than with -Onot
#1747
debugger: :trace is wasting time
#1136
High memory use when compiling many let bindings.


Type pile-up

Some programs can produce very deeply nested types of non-linear size. See "Scrap your type applications" for a way to improve these bad cases.

  • #9198: large performance regression in type checker speed in 7.8
    • Types in Core blowing up quadratically (as seen in -ddump-ds output)
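To make the quadratic growth concrete, here is a minimal sketch of a toy model. It is not GHC's Core: the `Ty` and `Expr` types, and the names `tySize`, `exprSize` and `nestedTuple`, are assumptions invented for illustration. The only idea borrowed from the real setting is that an explicitly typed pair constructor carries the types of both components as arguments, so a right-nested tuple repeats the ever-growing type of its tail at every level.

```haskell
-- Toy model (an assumption for illustration, not GHC's actual Core types).
-- A type is either a base type or a pair of types.
data Ty = TInt | TPair Ty Ty

-- An explicitly typed term: building a pair records both component types,
-- mimicking the type arguments of a constructor application.
data Expr
  = Lit Int
  | MkPair Ty Ty Expr Expr

-- Number of nodes in a type.
tySize :: Ty -> Int
tySize TInt        = 1
tySize (TPair a b) = 1 + tySize a + tySize b

-- Type of a term in the toy model.
typeOf :: Expr -> Ty
typeOf (Lit _)            = TInt
typeOf (MkPair ta tb _ _) = TPair ta tb

-- Number of nodes in a term, counting the embedded type arguments.
exprSize :: Expr -> Int
exprSize (Lit _)            = 1
exprSize (MkPair ta tb a b) = 1 + tySize ta + tySize tb + exprSize a + exprSize b

-- The right-nested tuple (1, (2, (..., n))) with all type arguments filled in.
nestedTuple :: Int -> Expr
nestedTuple n = foldr cons (Lit n) [1 .. n - 1]
  where
    cons i rest = MkPair TInt (typeOf rest) (Lit i) rest
```

In this model the source tuple has a linear number of components, but `exprSize (nestedTuple n)` works out to n^2 + n - 1 for n >= 1, because the type of the tail (of size 2k - 1 for a k-element tail) is stored again at each enclosing pair. Loading the definitions in GHCi and evaluating `map (exprSize . nestedTuple) [10, 20, 40]` shows the roughly fourfold growth for each doubling of n.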

Coercion pile-up

One theme that seems to pop up rather often is the production of Core with long strings of coercions, with the size scaling non-linearly with the size of the types in the source program. These may or may not be due to similar root-causes.

  • #8095: TypeFamilies painfully slow
    • Here a recursive type family instance leads to quadratic blow-up of coercions
    • This ticket has a discussion about a way to snip off coercions when not using -dcore-lint.
  • #7428: GHC compile times are seriously non-linear in program size
    • Here a CPS'd State monad is leading to a quadratic blowup in Core size over successive simplifier iterations
  • #5642: Deriving Generic of a big type takes a long time and lots of space
  • #14338: Simplifier fails with "Simplifier ticks exhausted"
    • Specialised dictionaries parametrized on a type-level list produce very large coercions.

One possible solution (proposed in #8095) is to eliminate coercions from the Core AST during usual compilation, instead only including them when we want to lint the Core.
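For readers unfamiliar with the pattern, the following is a small sketch of the kind of recursive type family that the tickets above describe. It is a hypothetical example written for this page, not the reproducer from any of the tickets; the names `Peano`, `Vec`, `Four`, `example` and `total` are all invented. Each use of a value at type `Vec n a` requires GHC to reduce the family one step per `'S`, and each step is witnessed in Core by a coercion that mentions the remaining type.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies #-}

import Data.Kind (Type)

-- Type-level Peano naturals (promoted with DataKinds).
data Peano = Z | S Peano

-- A recursive closed type family: reducing `Vec n a` takes one step per 'S.
type family Vec (n :: Peano) (a :: Type) :: Type where
  Vec 'Z     a = ()
  Vec ('S n) a = (a, Vec n a)

type Four = 'S ('S ('S ('S 'Z)))

-- The type checker must show that the nested tuple type equals `Vec Four Int`.
example :: Vec Four Int
example = (1, (2, (3, (4, ()))))

-- Taking the value apart again relies on the same equalities.
total :: Int
total = a + b + c + d
  where
    (a, (b, (c, (d, ())))) = example
```

At this size the effect is negligible. To experiment, one could enlarge the index and inspect the output of -ddump-simpl, which should show the casts involved; whether a given GHC version exhibits non-linear growth on a program of this shape is something to measure rather than assume.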

Deriving instances

Another recurring theme is perceived slowness when compiling code that derives instances. This could be due to a number of reasons:

  1. the implementation of the logic responsible for producing the instance code is inefficient
  2. the instance itself is large but could be expressed more concisely
  3. the instance itself is large but irreducibly so

While it's possible to fix (1) and (2), (3) is inherent.
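As an illustration of case (2), the sketch below contrasts two hand-written comparison functions for a small enumeration. This is only an analogy: it makes no claim about the code GHC's deriving mechanism actually generates, and the names `Colour`, `compareByCases` and `compareByIndex` are invented for the example. The first function needs one equation per pair of constructors, so its size grows quadratically with the number of constructors; the second expresses the same ordering in constant size by comparing constructor indices.

```haskell
data Colour = Red | Green | Blue
  deriving (Eq, Show, Enum, Bounded)

-- Large form: one equation for every pair of constructors (N^2 equations).
compareByCases :: Colour -> Colour -> Ordering
compareByCases Red   Red   = EQ
compareByCases Red   Green = LT
compareByCases Red   Blue  = LT
compareByCases Green Red   = GT
compareByCases Green Green = EQ
compareByCases Green Blue  = LT
compareByCases Blue  Red   = GT
compareByCases Blue  Green = GT
compareByCases Blue  Blue  = EQ

-- Concise form: the same ordering, independent of the number of constructors.
compareByIndex :: Colour -> Colour -> Ordering
compareByIndex a b = compare (fromEnum a) (fromEnum b)
```

Both functions agree on every pair of values, which can be checked in GHCi with `and [compareByCases a b == compareByIndex a b | a <- [minBound ..], b <- [minBound ..]]`.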

Uncategorised compiler performance issues

  • #2346: desugaring let-bindings
  • #10228: increase in compiler memory usage, regression from 7.8.4 to 7.10.1
  • #10289: 2.5k static HashSet takes too much memory to compile
    • Significantly improved in memory usage from #10370, but worse at overall wall-clock time!
  • #7450: Regression in optimisation time of functions with many patterns (6.12 to 7.4)?
  • #10800: vector-0.11 compile time increased substantially with 7.10.1
    • Regression in vector testsuite perhaps due to change in inlinings
  • #13639: Skylighting package compilation is glacial

nofib results

tests/perf/compiler results

7.6 vs 7.8

  • A bit difficult to decipher, since many of the stats and surrounding numbers were totally rewritten due to some Testsuite API overhauls.
  • The results are mixed: peak_megabytes_allocated was bumped up a lot in some tests, but many tests also saw bytes_allocated go down.

7.8 vs 7.10

  • Things mostly got better according to these, not worse!
  • Many of them had drops in bytes_allocated, for example, T4801.
  • The average improvement range is something like 1-3%.
  • But one got much worse; T5837's bytes_allocated jumped from 45520936 to 115905208, 2.5x worse!

7.10 vs HEAD

  • Most results actually got better, not worse!
  • Silent superclasses made HEAD drop in several places, some noticeably over 2x
    • max_bytes_used increased in some cases, but not much, probably GC wibbles.
  • No major regressions, mostly wibbles.

Compile/build times

(NB: Sporadically updated)

As of April 22nd, 2016:

  • GHC HEAD: 14m9s (via 7.8.3) (because of Joachim's call-arity improvements)
  • GHC 7.10: 15m43s (via 7.8.3)
  • GHC 7.8: 12m54s (via 7.8.3)
  • GHC 7.6: 8m19s (via 7.4.1)

Random note: GHC 7.10's build system actually disabled DPH (half a dozen more packages and probably a hundred extra modules), yet things *still* got slower over time!

Interesting third-party library numbers

  • Compile time of some example program (fluid-tree) of fltkhs library increased from about 15 seconds to more than a minute (original message).
  • GHC takes significantly more memory compiling the xmlhtml library with -j4 than -j1 (1GB vs 150MB). See #9370.
  • The Language.Haskell.Exts.Annotated.Syntax module of haskell-src-exts takes many tens of seconds to compile. However, this may not be surprising: it consists of roughly 70 data definitions, some with many constructors, with deriving (Eq,Ord,Show,Typeable,Data,Foldable,Traversable) on most of them, as well as defining Functor.
  • vector-algorithms may be a nice test and reportedly got slower to compile and run in recent GHC releases.

Relevant changes

GHC 7.10 to GHC 8.0

GHC 8.0 to GHC 8.2

GHC 8.2 to GHC 8.4

Last modified on Apr 4, 2018 9:41:01 AM