Version 19 (modified by igloo, 5 years ago)


Dynamic by default

Currently, GHCi doesn't use the system linker to load libraries, but instead uses our own "GHCi linker". Unfortunately, this is a large blob of unpleasant code that we need to maintain. It already contains a number of known bugs, and new problems tend to arise as new versions of OSes are released. We are therefore keen to get rid of it!

Even removing it only on particular OSes, arches, or OS/arch pairs would be useful, as much of the code is used only for a particular platform. However, the best outcome would be to remove it on all platforms, as that would allow us to simplify a lot more code.

Our solution is to switch GHCi from using the "static way" to using the "dynamic way". GHCi will then use the system linker to load the shared library for each package (a .so, .dylib or .dll, depending on the platform), rather than using the GHCi linker to load the static archive (.a).
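
To illustrate what "using the system linker" means in practice, the sketch below asks the operating system's dynamic loader to open a shared library at run time and resolves a symbol from it. It is written in Python with `ctypes` purely for illustration and is not how GHCi is implemented; it assumes a POSIX system on which the C library is available as a shared object.

```python
import ctypes
import ctypes.util

def load_c_library():
    # find_library returns a name such as "libc.so.6" on Linux, or None if
    # it cannot locate the library.
    name = ctypes.util.find_library("c")
    # CDLL passes the name to the system loader (dlopen on POSIX). Passing
    # None instead asks for the symbols already loaded into this process.
    return ctypes.CDLL(name)

libc = load_c_library()

# Declare the C signature of strlen so that ctypes converts values correctly.
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

print(strlen(b"ghci"))  # prints 4
```

The point of the example is that symbol lookup and relocation are handled entirely by the platform's loader, so no hand-written relocation code is needed on the caller's side.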

For this to work, there is technically no need to change anything else: ghc could continue to compile for the static way by default. However, two problems would then arise:

  1. cabal-install would need to install libraries not only for the static way (for use by ghc), but also for the dynamic way (for use by ghci). This would double library installation times and disk usage.
  2. GHCi would no longer be able to load modules compiled with ghc -c.


As well as the ticket for implementing dynamic-by-default (#3658), the table below lists the related tickets and the platforms that they affect. Most, if not all, of these would be immediately fixed by switching to dynamic-by-default.

| Ticket | Affects OS X x86_64? | Affects OS X x86? | Affects Linux x86_64? | Affects Linux x86? | Affects Windows x86_64? | Affects Windows x86? | Affects other platforms? |
|---|---|---|---|---|---|---|---|
| #781 GHCi on x86_64, cannot link to static data in shared libs | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
| #1883 GHC can't find library using "short" name | no | no | no | no | YES | YES | no |
| #2283 WIndows: loading objects that refer to DLL symbols | no | no | no | no | YES | YES | no |
| #3242 ghci: can't load .so/.DLL for: m (addDLL: could not load DLL) | no | no | no | no | YES | YES | no |
| #3654 Mach-O GHCi linker lacks support for a range of relocation entries | YES | YES | no | no | no | no | no |
| #4244 Use system linker in GHCi to support alpha, ia64, ppc64 | no | no | no | no | no | no | YES |
| #5062 Patch: Debug output for OS X linker and coding standard upgrades | YES | YES | no | no | no | no | no |
| #5197 Support static linker semantics for archives and weak symbols | YES | YES | YES | YES | YES | YES | YES |
| #5435 GHCi linker should run constructors for linked libraries | YES | YES | YES | YES | YES | YES | YES |
| #6107 GHCi runtime linker cannot link with duplicate common symbols | YES | YES | YES | YES | YES | YES | YES |
| #7043 32-bit GHC ceiling of negative float SEGFAULT: 11 | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
| #7056 GHCi loadArchive "libiconv.a":failed Unknown PEi386 section name `.drectve' | no | no | no | no | probably | YES | no |
| #7072 GHC interpreter does not find stat64 symbol on Linux | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
| #7097 linker fails to load package with binding to foreign library | no | no | no | no | probably | YES | no |
| #7103 Compiler panic, when loading wxc in GHCi | no | no | no | no | probably | YES | no |
| #7134 ghc- -> internal error R_X86_64_PC32 | no | no | no | no | YES | no | no |
| #7207 linker fails to load package with binding to foreign library (win64) | no | no | no | no | YES | no | no |
| #7299 threadDelay broken in ghci, Mac OS X | YES | YES | no | no | no | no | no |
| #7357 GHC.exe gives an internal error while linking vector's Monadic.hs | no | no | no | no | YES | no | no |


Full nofib results showing the effect of switching to dynamic-by-default are available for OS X x86_64, OS X x86, Linux x86_64 and Linux x86. There is also a table of the highlights below. In summary:

Binary sizes shrink by more than 90% on every platform, as the libraries are now dynamically linked rather than copied into each executable.

Things are rosiest on OS X x86_64. On this platform, -fPIC is always on, so using dynamic libraries doesn't mean giving up a register for PIC. Overall, performance is a few percent better with dynamic by default.

On OS X x86, the situation is not so nice. On x86 we are very short on registers, and giving up another for PIC means we end up around 15% down on performance.

On Linux x86_64 we have more registers, so the effect of giving one up for PIC isn't so pronounced, but we still lose a few percent performance overall.

For unknown reasons, x86 Linux suffers even worse than x86 OS X, with around a 30% performance penalty.

| Metric | Bound | static -> dynamic on OS X x86_64 | static -> dynamic on OS X x86 | static -> dynamic on Linux x86_64 | static -> dynamic on Linux x86 |
|---|---|---|---|---|---|
| Binary Sizes | -1 s.d. | -95.8% | -95.8% | -95.8% | -95.9% |
| | +1 s.d. | -93.1% | -92.8% | -92.6% | -92.4% |
| Run Time | -1 s.d. | -1.2% | +11.7% | -2.5% | +16.6% |
| | +1 s.d. | +1.6% | +20.0% | +9.6% | +40.3% |
| Elapsed Time | -1 s.d. | -6.9% | +10.3% | -2.5% | +16.6% |
| | +1 s.d. | -0.3% | +20.4% | +9.6% | +40.3% |
| Mutator Time | -1 s.d. | -1.3% | +8.9% | -5.0% | +18.3% |
| | +1 s.d. | +1.9% | +18.3% | +7.5% | +46.8% |
| Mutator Elapsed Time | -1 s.d. | -4.5% | +7.7% | -5.0% | +18.3% |
| | +1 s.d. | +0.3% | +18.8% | +7.5% | +46.8% |
| GC Time | -1 s.d. | -1.4% | +16.3% | +5.6% | +13.4% |
| | +1 s.d. | +1.8% | +27.1% | +11.2% | +24.0% |
| GC Elapsed Time | -1 s.d. | -1.5% | +15.8% | +5.6% | +13.4% |
| | +1 s.d. | +1.3% | +25.6% | +11.2% | +24.0% |
| Compile Times | -1 s.d. | -11.7% | +6.2% | -1.8% | +27.0% |
| | +1 s.d. | -0.5% | +18.2% | +7.8% | +37.8% |
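
One way to read the ranges in the table above is to collapse each pair of bounds into a single central figure. The sketch below assumes the bounds are symmetric in log space around a geometric mean (our understanding of how nofib-analyse summarises ratios, but treat it as an assumption); under that assumption the centre is the square root of the product of the two ratios. The function name `central_change` is our own, and only the Run Time row is used.

```python
import math

def central_change(lo_pct, hi_pct):
    """Geometric midpoint of a (-1 s.d., +1 s.d.) range, as a percentage change."""
    lo_ratio = 1 + lo_pct / 100
    hi_ratio = 1 + hi_pct / 100
    return (math.sqrt(lo_ratio * hi_ratio) - 1) * 100

# Run Time bounds (static -> dynamic), copied from the table above.
run_time = {
    "OS X x86_64": (-1.2, +1.6),
    "OS X x86": (+11.7, +20.0),
    "Linux x86_64": (-2.5, +9.6),
    "Linux x86": (+16.6, +40.3),
}

midpoints = {p: central_change(lo, hi) for p, (lo, hi) in run_time.items()}
for platform, pct in midpoints.items():
    print(f"{platform}: {pct:+.1f}%")
```

On these inputs the midpoints come out at roughly +0.2%, +15.8%, +3.4% and +27.9%, in line with the "around 15%" and "around 30%" figures quoted for the two x86 platforms.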

OS X x86 vs x86_64

Currently, some people use the x86 version of GHC on OS X for performance reasons. It's not clear for how much longer this will be viable, as other OS X libraries start dropping x86 support.

Full nofib results comparing the two are here for static by default, and here for dynamic by default, but the highlights are in the table below.

The left-hand column shows the status quo: x86_64 only beats x86 in mutator time, and that is a shallow victory as the higher GC time means that total runtime is worse for x86_64.

The right-hand column shows what the situation would be if we switched to dynamic instead. Allocations, memory use, etc. remain higher because all word-sized values are twice as big. However, the combination of x86_64's performance improving and x86's performance getting worse means that x86_64 is now faster overall.

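| Metric | Bound | x86 -> x86_64 when static by default | x86 -> x86_64 when dynamic by default |
|---|---|---|---|
| Binary Sizes | -1 s.d. | +38.0% | +7.4% |
| | +1 s.d. | +38.6% | +30.6% |
| (label missing; probably Allocations) | -1 s.d. | +63.2% | +63.2% |
| | +1 s.d. | +114.4% | +114.4% |
| Run Time | -1 s.d. | -23.5% | -31.6% |
| | +1 s.d. | +36.1% | +14.7% |
| Elapsed Time | -1 s.d. | -18.2% | -30.0% |
| | +1 s.d. | +40.1% | +17.0% |
| Mutator Time | -1 s.d. | -32.4% | -38.8% |
| | +1 s.d. | +20.1% | +3.0% |
| Mutator Elapsed Time | -1 s.d. | -28.7% | -37.9% |
| | +1 s.d. | +22.5% | +4.4% |
| GC Time | -1 s.d. | +4.5% | -11.9% |
| | +1 s.d. | +74.8% | +54.1% |
| GC Elapsed Time | -1 s.d. | +7.9% | -8.0% |
| | +1 s.d. | +75.1% | +56.7% |
| Total Memory in use | -1 s.d. | -1.7% | -1.9% |
| | +1 s.d. | +88.9% | +88.9% |
| Compile Times | -1 s.d. | +11.9% | -8.9% |
| | +1 s.d. | +21.1% | +2.9% |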
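
The argument in this section can be checked with a little arithmetic. The sketch below takes the Run Time bounds from the table above and from the OS X columns of the static -> dynamic table, reduces each to a geometric midpoint (assuming, as an approximation, that the bounds are symmetric in log space), and then composes them: the dynamic x86 -> x86_64 ratio should be close to the static ratio multiplied by x86_64's change and divided by x86's change. The helper `mid_ratio` is a name of our own, and the composition is only approximate because midpoints of separately summarised runs need not multiply exactly.

```python
import math

def mid_ratio(lo_pct, hi_pct):
    """Geometric midpoint of a (-1 s.d., +1 s.d.) range, as a ratio."""
    return math.sqrt((1 + lo_pct / 100) * (1 + hi_pct / 100))

# Run Time, x86 -> x86_64 on OS X (from the table above).
static_ratio = mid_ratio(-23.5, +36.1)
dynamic_ratio = mid_ratio(-31.6, +14.7)

# Run Time, static -> dynamic on OS X (from the earlier table).
x86_64_change = mid_ratio(-1.2, +1.6)
x86_change = mid_ratio(+11.7, +20.0)

# If x86_64 barely changes while x86 slows down, the x86 -> x86_64 ratio
# should shift by the quotient of the two changes.
predicted_dynamic_ratio = static_ratio * x86_64_change / x86_change

print(f"static:    x86_64 takes {static_ratio:.3f}x the x86 run time")
print(f"dynamic:   x86_64 takes {dynamic_ratio:.3f}x the x86 run time")
print(f"predicted: {predicted_dynamic_ratio:.3f}x")
```

The midpoints come out at about 1.02 under static and 0.89 under dynamic, with the composed estimate at about 0.88, which is consistent with x86_64 moving from slightly slower to faster overall.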