We no longer plan to take this approach. See DynamicGhcPrograms instead.
In particular, DYNAMIC_BY_DEFAULT has likely bitrotted: it does not seem possible to build with that flag on anymore (tested on Linux on a48464a7d2858bad28cfd1f393e82589825e62db). It still seems possible to manually toggle DYNAMIC_BY_DEFAULT in lib/ghc-*/platformConstants, but this is probably considered an unsupported feature.
Dynamic by default
Currently, GHCi doesn't use the system linker to load libraries, but instead uses our own "GHCi linker". Unfortunately, this is a large blob of unpleasant code that we need to maintain. It already contains a number of known bugs, and new problems tend to arise as new versions of OSes are released. We are therefore keen to get rid of it!
There is some benefit (in terms of both bugs fixed and code removed) to removing it for a particular platform (e.g. Linux(elf)/x86), more benefit to removing it for a particular OS (e.g. Linux(elf)), but the most benefit is gained by removing it entirely.
Our solution is to switch GHCi from using the "static way" to using the "dynamic way". GHCi will then use the system linker to load the shared library (.so, .dylib or .dll, depending on the platform), rather than using the GHCi linker to load the static archive (.a).
(See #3658 for related design decisions etc.)
For this to work, there is technically no need to change anything else: ghc could continue to compile for the static way by default. However, two problems would arise:
- cabal-install would need to install libraries not only for the static way (for use by ghc), but also for the dynamic way (for use by ghci). This would double library installation times and disk usage.
- GHCi would no longer be able to load modules compiled with ghc -c. This would violate the principle of least surprise, and would make it harder to work around GHCi's limitations (such as performance, and lack of support for unboxed tuples).
Given these two issues, we think that if we make GHCi use dynamic libraries, we should also make ghc compile the "dynamic way" by default.
Current status
Unix-like platforms
We have everything working, although currently disabled, for all Unix-like platforms in GHC HEAD. We have tested it on Linux/x86_64, Linux/x86, OSX/x86_64, OSX/x86 and Linux/s390 (unregisterised). To enable it, set DYNAMIC_BY_DEFAULT = YES in mk/build.mk or mk/validate.mk as appropriate.
Windows
Currently, we don't know how to do dynamic-by-default on Windows in a satisfactory way. We can build dynamic libraries, but we don't have a way of telling them where to find their DLLs.
We are not currently working on this, but if anyone is interested in rolling up their sleeves then we would be very grateful! We have some more details on the problem, and how it might be solvable.
Other platforms
We do not know the situation with other platforms, such as iOS and Android. We do not know whether they have a system loader that they can use, or whether it would be useful to keep the GHCi linker around for them.
Bugs
As well as the ticket for implementing dynamic-by-default (#3658), the table below lists the related tickets and the platforms that they affect. Most, if not all, of these would be immediately fixed by switching to dynamic-by-default.
Performance
There are some performance questions to consider before making a decision.
Performance of the dynamic way
Full nofib results showing the effect of switching to dynamic-by-default are available for OS X x86_64, OS X x86, Linux x86_64 and Linux x86. (We don't have Windows performance numbers, as we don't have dynamic-by-default working on Windows yet.) The highlights are in the table below. In summary:
Binary sizes are way down across the board, as we are now dynamically linking to the libraries.
Things are rosiest on OS X x86_64. On this platform, -fPIC is always on, so using dynamic libraries doesn't mean giving up a register for PIC. Overall, performance is a few percent better with dynamic by default.
On OS X x86, the situation is not so nice. On x86 we are very short on registers, and giving up another for PIC means we end up around 15% down on performance.
On Linux x86_64 we have more registers, so the effect of giving one up for PIC isn't so pronounced, but we still lose a few percent performance overall.
For unknown reasons, x86 Linux suffers even worse than x86 OS X, with around a 30% performance penalty.
| | static -> dynamic on OS X x86_64 | static -> dynamic on OS X x86 | static -> dynamic on Linux x86_64 | static -> dynamic on Linux x86 |
---|---|---|---|---|
Binary Sizes | ||||
-1 s.d. | -95.8% | -95.8% | -95.8% | -95.9% |
+1 s.d. | -93.1% | -92.8% | -92.6% | -92.4% |
Average | -94.6% | -94.5% | -94.5% | -94.4% |
Run Time | ||||
-1 s.d. | -1.2% | +11.7% | -2.5% | +16.6% |
+1 s.d. | +1.6% | +20.0% | +9.6% | +40.3% |
Average | +0.2% | +15.8% | +3.3% | +27.9% |
Elapsed Time | ||||
-1 s.d. | -6.9% | +10.3% | -2.5% | +16.6% |
+1 s.d. | -0.3% | +20.4% | +9.6% | +40.3% |
Average | -3.7% | +15.2% | +3.3% | +27.9% |
Mutator Time | ||||
-1 s.d. | -1.3% | +8.9% | -5.0% | +18.3% |
+1 s.d. | +1.9% | +18.3% | +7.5% | +46.8% |
Average | +0.3% | +13.5% | +1.1% | +31.8% |
Mutator Elapsed Time | ||||
-1 s.d. | -4.5% | +7.7% | -5.0% | +18.3% |
+1 s.d. | +0.3% | +18.8% | +7.5% | +46.8% |
Average | -2.1% | +13.1% | +1.1% | +31.8% |
GC Time | ||||
-1 s.d. | -1.4% | +16.3% | +5.6% | +13.4% |
+1 s.d. | +1.8% | +27.1% | +11.2% | +24.0% |
Average | +0.2% | +21.6% | +8.4% | +18.6% |
GC Elapsed Time | ||||
-1 s.d. | -1.5% | +15.8% | +5.6% | +13.4% |
+1 s.d. | +1.3% | +25.6% | +11.2% | +24.0% |
Average | -0.1% | +20.6% | +8.4% | +18.6% |
Compile Times | ||||
-1 s.d. | -11.7% | +6.2% | -1.8% | +27.0% |
+1 s.d. | -0.5% | +18.2% | +7.8% | +37.8% |
Average | -6.3% | +12.1% | +2.9% | +32.3% |
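A note on reading the "-1 s.d." and "+1 s.d." rows: the figures look multiplicative rather than additive, as if the average were a geometric mean with a symmetric spread factor on either side. That is an inference from the numbers, not something stated on this page. The sketch below checks it for the Run Time rows; the helper names are made up for illustration.

```python
# Run Time rows from the table above: (-1 s.d., average, +1 s.d.), in percent.
RUN_TIME = {
    "OS X x86_64": (-1.2, 0.2, 1.6),
    "OS X x86": (11.7, 15.8, 20.0),
    "Linux x86_64": (-2.5, 3.3, 9.6),
    "Linux x86": (16.6, 27.9, 40.3),
}


def to_factor(percent):
    """Convert a percentage change into a multiplicative factor (+15.8% -> 1.158)."""
    return 1.0 + percent / 100.0


def spread_factors(low, average, high):
    """Return (average/low, high/average) as factors.

    If the rows are a geometric mean with one multiplicative standard
    deviation either side, the two values should be nearly equal.
    """
    below = to_factor(average) / to_factor(low)
    above = to_factor(high) / to_factor(average)
    return below, above


for platform, row in RUN_TIME.items():
    below, above = spread_factors(*row)
    print(f"{platform}: average/low = {below:.3f}, high/average = {above:.3f}")
```

For each platform the two ratios agree closely (the table is rounded to 0.1%), which is consistent with a multiplicative spread around a geometric mean.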
OS X x86 vs x86_64
Currently, some people use the x86 version of GHC on OS X for performance reasons. It's not clear for how much longer this will be viable, as other OS X libraries start dropping x86 support.
Full nofib results comparing the two are here for static by default, here for dynamic by default, and here for comparing static x86 to dynamic x86_64. The highlights are in the table below.
The left-hand column shows the status quo: x86_64 only beats x86 in mutator time, and that is a shallow victory as the higher GC time means that total runtime is worse for x86_64.
The middle column shows what the situation would be if we switched to dynamic by default instead. Allocations, memory use, etc. remain higher because all word-sized things are twice as big. However, the combination of x86_64's performance improving and x86's performance getting worse means that x86_64 is now faster overall.
The right-hand column shows the difference between static x86 and dynamic x86_64.
| | x86 -> x86_64 when static by default | x86 -> x86_64 when dynamic by default | x86 static -> x86_64 dynamic |
---|---|---|---|
Binary Sizes | |||
-1 s.d. | +38.0% | +7.4% | -95.9% |
+1 s.d. | +38.6% | +30.6% | -92.0% |
Average | +38.3% | +18.5% | -94.3% |
Allocations | |||
-1 s.d. | +63.2% | +63.2% | +63.2% |
+1 s.d. | +114.4% | +114.4% | +114.4% |
Average | +87.0% | +87.0% | +87.0% |
Run Time | |||
-1 s.d. | -23.5% | -31.6% | -23.6% |
+1 s.d. | +36.1% | +14.7% | +37.0% |
Average | +2.1% | -11.4% | +2.3% |
Elapsed Time | |||
-1 s.d. | -18.2% | -30.0% | -22.9% |
+1 s.d. | +40.1% | +17.0% | +38.3% |
Average | +7.0% | -9.5% | +3.3% |
Mutator Time | |||
-1 s.d. | -32.4% | -38.8% | -32.4% |
+1 s.d. | +20.1% | +3.0% | +20.7% |
Average | -9.9% | -20.6% | -9.7% |
Mutator Elapsed Time | |||
-1 s.d. | -28.7% | -37.9% | -32.0% |
+1 s.d. | +22.5% | +4.4% | +21.3% |
Average | -6.6% | -19.5% | -9.2% |
GC Time | |||
-1 s.d. | +4.5% | -11.9% | +4.1% |
+1 s.d. | +74.8% | +54.1% | +76.3% |
Average | +35.2% | +16.5% | +35.5% |
GC Elapsed Time | |||
-1 s.d. | +7.9% | -8.0% | +7.1% |
+1 s.d. | +75.1% | +56.7% | +76.0% |
Average | +37.4% | +20.0% | +37.3% |
Total Memory in use | |||
-1 s.d. | -1.7% | -1.9% | -1.8% |
+1 s.d. | +88.9% | +88.9% | +88.9% |
Average | +36.3% | +36.1% | +36.2% |
Compile Times | |||
-1 s.d. | +11.9% | -8.9% | +2.5% |
+1 s.d. | +21.1% | +2.9% | +17.2% |
Average | +16.4% | -3.1% | +9.6% |
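The average Run Time entries can be roughly cross-checked against the OS X columns of the static -> dynamic table earlier, by treating each percentage as a multiplicative factor and chaining the factors. This is only a sanity check: it assumes the averages compose like ratios (as geometric means over the same set of benchmarks would), and the function names are illustrative.

```python
def chain(*percent_changes):
    """Compose successive percentage changes multiplicatively; result in percent."""
    factor = 1.0
    for percent in percent_changes:
        factor *= 1.0 + percent / 100.0
    return (factor - 1.0) * 100.0


def reverse(percent):
    """Percentage change for the opposite direction (e.g. dynamic -> static)."""
    return (1.0 / (1.0 + percent / 100.0) - 1.0) * 100.0


# Average Run Time figures taken from the two tables on this page.
X86_STATIC_TO_X64_STATIC = 2.1    # left-hand column of the table above
X64_STATIC_TO_DYNAMIC = 0.2       # OS X x86_64 column of the earlier table
X86_STATIC_TO_DYNAMIC = 15.8      # OS X x86 column of the earlier table

# x86 static -> x86_64 static -> x86_64 dynamic
right_column = chain(X86_STATIC_TO_X64_STATIC, X64_STATIC_TO_DYNAMIC)

# x86 dynamic -> x86 static -> x86_64 static -> x86_64 dynamic
middle_column = chain(reverse(X86_STATIC_TO_DYNAMIC),
                      X86_STATIC_TO_X64_STATIC,
                      X64_STATIC_TO_DYNAMIC)

print(f"x86 static  -> x86_64 dynamic: {right_column:+.1f}%")
print(f"x86 dynamic -> x86_64 dynamic: {middle_column:+.1f}%")
```

The first estimate comes out at about +2.3%, matching the right-hand column. The second comes out within a few tenths of a percentage point of the -11.4% in the middle column, which is about as close as chaining rounded averages can be expected to get.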
Implications of the performance difference
If GHCi uses dynamic libraries by default, then ghci will need to be dynamically linked. It would therefore make sense to also have ghc be dynamically linked. This means that any performance difference will also affect the performance of the compiler (this is already accounted for in the "Compile Times" in the nofib results).
It would still be possible to compile programs using the "static way" by giving ghc the -static flag, and users would be able to configure cabal-install to do so by default if they wish. Then programs would be exactly the same as they are today. However, this would have the drawback that cabal-install would need to be configured to install libraries for the static way as well as the dynamic way, so library installation would take twice as long.
Other issues
Cabal support
Currently released versions of Cabal/cabal-install don't handle dynamic-by-default GHCs well, as they don't pass the -static flag when building for static ways (they assume that static is the default). We should get fixed versions out as soon as possible (#7439).
Profiling
Should we support both static and dynamic profiling ways? If not, which?
If we support both, it would be a little odd if ghc -prof used the static profiling libraries. Presumably ghc -prof -dynamic would use dynamic profiling libraries, but what about ghc -dynamic -prof? So if we support both then you would presumably need to say ghc -static -prof to use the static ones.
Currently Cabal has separate --enable-library-profiling and --enable-shared flags, but we don't have a way to distinguish static-profiling from dynamic-profiling. If we want to support both then we'll need to add a Cabal flag for it.
Questions
In summary, we need to answer the following questions:
1. Should we enable dynamic by default on OS X x86_64?
2. Should we enable dynamic by default on OS X x86?
3. Should we enable dynamic by default on Linux x86_64?
4. Should we enable dynamic by default on Linux x86?
5. Should we enable dynamic by default on Windows x86_64?
6. Should we enable dynamic by default on Windows x86?
7. Should we enable dynamic by default on other platforms?
8. For platforms using dynamic by default, should Cabal also install static libraries by default?
9. Should ghc -prof use dynamic or static libraries when dynamic by default?
For 1 and 3, the performance impact appears negligible (or perhaps even a slight improvement) and some bugs will be fixed, so we would suggest that the answer should be yes.
For 2 and 4, there would be a considerable performance impact, but there is again a negligible impact if you instead switch to using x86_64. We believe that this would be feasible for the vast majority of users for whom performance is a concern, and it would greatly simplify the code base, so again we would suggest that the answer should be yes.
For 5 and 6, we will first have to get it working. Windows already uses different code paths quite a lot, so even if we end up deciding not to go dynamic-by-default on Windows, a lot of the ugly, buggy code will be removed. However, there are some known bugs on Windows, and we would be able to remove more code if we switched all platforms, so we are hopeful that we will be able to do dynamic-by-default here too.
For 7, this makes the difference between being able to use ghci and not being able to use ghci, and performance is already compromised for unregisterised platforms. Therefore this looks like a definite yes.
For 8, this is a trade-off between the convenience of always having static libraries available (which may be important for people for whom performance is critical), and doubling the time needed to install extra libraries. On balance, we'd suggest that the answer should be no. If we go for yes, then we'd probably want a little extra intelligence in cabal, which checks that e.g. there is a static version of base installed before trying to install the static way, as development compilers in particular may only be built with the dynamic libraries available.
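The "extra intelligence" mentioned for question 8 could look something like the sketch below. It is purely illustrative: the function, the way names and the shape of the installed-package map are hypothetical and do not correspond to any existing Cabal API.

```python
def ways_to_build(dependencies, installed_ways, want_static=True):
    """Decide which ways a package can be built in (hypothetical helper).

    dependencies:   names of the packages this package depends on
    installed_ways: maps a package name to the set of ways it is installed in
    want_static:    whether the user asked for static libraries as well
    """
    ways = ["dynamic"]  # always built on a dynamic-by-default compiler
    static_available = all(
        "static" in installed_ways.get(dep, set()) for dep in dependencies
    )
    # Only attempt the static way when every dependency (e.g. base) has it;
    # a development compiler may ship only the dynamic libraries.
    if want_static and static_available:
        ways.append("static")
    return ways


# A development compiler with a dynamic-only base:
dev_compiler = {"base": {"dynamic"}, "containers": {"dynamic", "static"}}
# A release compiler with both ways installed:
release_compiler = {"base": {"dynamic", "static"},
                    "containers": {"dynamic", "static"}}

print(ways_to_build(["base", "containers"], dev_compiler))
print(ways_to_build(["base", "containers"], release_compiler))
```

With the dynamic-only base the static way is skipped rather than failing part-way through the build; with both ways installed the package is built twice, which is the doubling of install time discussed above.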
For 9, we'd suggest that -prof should use dynamic libraries, but it should also still be possible to use static libraries if also using the -static flag.
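The rule suggested for question 9 can be written down as a small model. This is a sketch of the proposal only, not of how ghc parses its flags; the function name is made up, and the "last linkage flag wins" behaviour is an assumption added to make the model total.

```python
def libraries_for(flags, dynamic_by_default=True):
    """Return which libraries a ghc invocation would use under the suggestion."""
    linkage = "dynamic" if dynamic_by_default else "static"
    for flag in flags:
        # Assumption: if both -static and -dynamic are given, the last one wins.
        if flag == "-static":
            linkage = "static"
        elif flag == "-dynamic":
            linkage = "dynamic"
    # -prof only selects profiling; where it appears relative to the
    # linkage flags makes no difference in this model.
    kind = "profiling" if "-prof" in flags else "vanilla"
    return f"{linkage} {kind}"


for flags in (["-prof"], ["-prof", "-dynamic"],
              ["-dynamic", "-prof"], ["-static", "-prof"]):
    print("ghc", " ".join(flags), "->", libraries_for(flags))
```

Under this model ghc -prof -dynamic and ghc -dynamic -prof behave identically, which would settle the ordering question raised in the Profiling section, and static profiling libraries are only used when -static is given explicitly.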