Opened 8 months ago

Closed 5 months ago

Last modified 4 months ago

#13541 closed feature request (fixed)

Make it easier to use the gold linker

Reported by: bgamari Owned by:
Priority: highest Milestone: 8.2.1
Component: Compiler Version: 8.0.1
Keywords: Cc: jaffacake, rwbarton, hvr, angerman, geekosaur, nh2, michalt, RyanGlScott, duog
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #13810, #13739 Differential Rev(s): Phab:D3449, Phab:D3694
Wiki Page:

Description (last modified by bgamari)

As pointed out in #4862, the gold linker is significantly faster than BFD ld. Currently we use whatever linker gcc uses by default. This is an unfortunate situation for users as few packagers take the effort to configure their builds to use gold.

I think we should consider the following,

Introduce a configure flag (to both the source distribution, and the distributed binary distributions), --enable-gold. When enabled, configure will check for the functioning of gcc -fuse-ld=gold. If found to work -fuse-ld=gold would be added to GHC's optlc. The flag would throw an error on non-ELF platforms (which are not supported by gold).
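The check described above could be sketched roughly as follows. This is only an illustration in Python, not the actual configure logic; the helper names `linker_flag_works` and `extra_link_flags` are made up for this example, and it assumes the compiler driver signals an unsupported `-fuse-ld=` value with a non-zero exit status.

```python
import os
import shutil
import subprocess
import tempfile
from typing import List


def linker_flag_works(cc: str, flag: str) -> bool:
    """Return True if `cc` can compile and link a trivial program with `flag`."""
    if shutil.which(cc) is None:
        # No such compiler on PATH, so the flag cannot work.
        return False
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "conftest.c")
        exe = os.path.join(tmp, "conftest")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        result = subprocess.run(
            [cc, flag, src, "-o", exe],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0


def extra_link_flags(cc: str, enable_gold: bool) -> List[str]:
    """Flags to add to the link line (hypothetical helper for this sketch)."""
    if enable_gold and linker_flag_works(cc, "-fuse-ld=gold"):
        return ["-fuse-ld=gold"]
    # Status quo: use whatever linker the compiler picks by default.
    return []
```

Calling `extra_link_flags("gcc", enable_gold=True)` would return `["-fuse-ld=gold"]` only on a system where gcc can actually link with gold. This sketch silently falls back instead of throwing an error, which is one of the two behaviors under discussion.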

While there is admittedly not a whole lot of precedent for this, the status quo means we are leaving a significant bit of compiler performance on the table in a majority of cases. Given that stack uses GHC's official bindists, we should try to improve this situation.

In fact, I would even weakly suggest that we might consider making --enable-gold the default behavior, requiring the user to explicitly pass --disable-gold if they want the current behavior.

Change History (35)

comment:1 Changed 8 months ago by bgamari

Description: modified (diff)

comment:2 Changed 8 months ago by bgamari

Description: modified (diff)

comment:3 Changed 8 months ago by bgamari

Another unfortunate aspect of the status quo is that the user may think that passing LD=gold to ./configure will configure the compiler to use gold, but they would be wrong.

Another approach would be to check whether LD is set to gold and, if so, check for a functional -fuse-ld=gold.

comment:4 Changed 8 months ago by bgamari

Description: modified (diff)

comment:5 Changed 8 months ago by angerman

A few things to note:

  • gold does ELF only. So this should only be available and enabled if we are targeting ELF.
  • I'd also suggest that this is only enabled by default if gcc is used. If clang is used or gcc /is/ clang, don't enable gold.

I still believe that this kind of configuration should be part of the toolchain and not part of ghc though. I'm also a bit confused why gcc would not respect the LD env var. Maybe someone with better knowledge of the gcc toolchain knows.

comment:6 Changed 8 months ago by thoughtpolice

Gold works perfectly fine with Clang on Linux. Why block that combination?

comment:7 Changed 8 months ago by angerman

I might have gone a bit overboard here. I'm not sure if this ticket is related to https://phabricator.haskell.org/D3351, however clang doesn't understand -fuse-ld (at least not the one that comes with the android toolchain, not sure if this statement holds universally), and thus forcing that flag results in a non-functioning compiler.

I certainly don't want to block anything. I just don't want to have anything forced on me by default that might run counter to my toolchain configuration. In my view this should be part of the toolchain, and ghc should make sure to pick up the configuration from the toolchain proper instead of making the decision on its own.

I guess a similar case could be made for using lld?

comment:8 Changed 8 months ago by bgamari

To be clear, I don't think there is any need to block any particular combinations. We simply teach configure to check that -fuse-ld=gold works. If the compiler doesn't support it then the check fails and we either throw an error or proceed with the status quo.

comment:9 Changed 8 months ago by bgamari

Apparently lld is another factor of three faster than gold, so I suppose if we want to go this route we should ensure the solution will extend to that case as well.

I'd love to hear others' opinions on this general direction.

comment:10 in reply to:  7 ; Changed 7 months ago by nh2

Replying to angerman:

clang doesn't understand -fuse-ld (at least not the one that comes with the android toolchain, not sure if this statement holds universally)

This seems to be Android toolchain specific.

Upstream clang supports -fuse-ld since 2014: https://github.com/llvm-mirror/clang/commit/bab68c96f15867af6938d9c8f922d59d31351cad

comment:11 in reply to:  5 ; Changed 7 months ago by nh2

Replying to angerman:

I'm also a bit confused why gcc would not respect the LD env var.

I guess this may be controversial, but I have always liked the fact that cabal and GHC rely as little as possible on environment variables.

It has made it much easier many times for me to debug ghc issues (funnily, especially the linker investigation) because I can see all relevant inputs to a ghc invocation simply in ps or strace, and re-run them to isolate problems without accidentally not replicating the environment correctly.

comment:12 Changed 7 months ago by nh2

Cc: nh2 added

comment:13 in reply to:  10 Changed 7 months ago by angerman

Replying to nh2:

Replying to angerman:

clang doesn't understand -fuse-ld (at least not the one that comes with the android toolchain, not sure if this statement holds universally)

This seems to be Android toolchain specific.

Upstream clang supports -fuse-ld since 2014: https://github.com/llvm-mirror/clang/commit/bab68c96f15867af6938d9c8f922d59d31351cad

This likely has been a misreading of the error on my side, as noted in https://phabricator.haskell.org/D3351. The LD that comes with the android toolchain is called <toolchain>-ld or <toolchain>-ld.gold, and both are effectively gold. There is no tool called just ld.gold; there is one called ld, but that is ld64, and would be horribly wrong to use.

comment:14 in reply to:  11 Changed 7 months ago by angerman

Replying to nh2:

Replying to angerman:

I'm also a bit confused why gcc would not respect the LD env var.

I guess this may be controversial, but I have always liked the fact that cabal and GHC rely as little as possible on environment variables.

It has made it much easier many times for me to debug ghc issues (funnily, especially the linker investigation) because I can see all relevant inputs to a ghc invocation simply in ps or strace, and re-run them to isolate problems without accidentally not replicating the environment correctly.

My personal issue with not respecting env variables is that, without respecting the environment, one has to

  • have explicit flags for each possible configuration value, that otherwise would have been taken from the environment.
  • accept that, by not using the environment, one is likely to break tooling that depends on the environment.

GHC already has a lot of logic to find tools at configuration time and store their paths. FIND_LD already tries to detect gold, (and fails for the android toolchain...)

I am much more in favor of doing tool checking rather than magic. What I mean is this: instead of trying hard to find some tool (say ld), use $LD. However, if we know we want gold, try $LD --version to verify it actually *is* gold. If it's not, report an error saying that $LD is not set to gold and that this is known not to work. If you want to ignore the error, pass --compat-warning-only, which would then print a warning:

Warning: linker ($LD) does not seem to be gold. Continuing anyway due to --compat-warning-only.

instead of

Error: linker ($LD) does not seem to be gold. bfd is known not to work. To continue anyway, pass --compat-warning-only.

Then again, this makes me wonder why we test for gold, and not against bfd in the first place? Why force gold, if lld is fine as well, when all we want is to make sure we don't use a buggy/broken/slow linker called bfd?
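The `$LD --version` check proposed in this comment could look something like the sketch below. It assumes that the first matching line of the version banner starts with `GNU gold`, `GNU ld` (BFD) or `LLD`. The function names are invented for illustration, and `--compat-warning-only` is the flag suggested in this comment, not an existing configure option.

```python
import subprocess
from typing import Tuple


def linker_version(ld: str) -> str:
    """Run `ld --version` and return its standard output."""
    result = subprocess.run([ld, "--version"], capture_output=True, text=True)
    return result.stdout


def classify_linker(version_output: str) -> str:
    """Guess the linker family from the `--version` banner (assumed prefixes)."""
    for line in version_output.splitlines():
        if line.startswith("GNU gold"):
            return "gold"
        if line.startswith("LLD"):
            return "lld"
        if line.startswith("GNU ld"):
            return "bfd"
    return "unknown"


def check_linker(ld: str, version_output: str,
                 warning_only: bool = False) -> Tuple[str, str]:
    """Return (severity, message) for the gold check described above."""
    if classify_linker(version_output) == "gold":
        return ("ok", "")
    if warning_only:
        return ("warning",
                "Warning: linker (%s) does not seem to be gold. "
                "Continuing anyway due to --compat-warning-only." % ld)
    return ("error",
            "Error: linker (%s) does not seem to be gold. "
            "bfd is known not to work. "
            "To continue anyway, pass --compat-warning-only." % ld)
```

In use, one would call `check_linker(ld, linker_version(ld))` with the value of `$LD`. Rejecting only the `"bfd"` result instead of requiring `"gold"` would give the "test against bfd" variant raised at the end of the comment.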

comment:15 Changed 7 months ago by michalt

Cc: michalt added

comment:16 Changed 5 months ago by RyanGlScott

Cc: RyanGlScott added

comment:17 Changed 5 months ago by duog

Cc: duog added

comment:18 Changed 5 months ago by bgamari

Differential Rev(s): Phab:D3449

#13810 demonstrates the terrible brittleness of the current state of affairs, where we rely on users to explicitly set their choice of linker.

comment:19 Changed 5 months ago by bgamari

I've updated Phab:D3449 to reflect my current thinking on this. Namely, we provide a configure flag that indicates that the user doesn't mind if GHC overrides the system's default linker. If this flag is passed, we use either gold or lld, if available. The user can explicitly request one or the other by passing the LD=... variable to configure.

Does this seem reasonable?

comment:20 Changed 5 months ago by nh2

The user can explicitly request one or the other by passing the LD=... variable to configure.

@bgamari: To clarify, will GHC still at run-time detect which linker is available, and can the user still at run-time tell GHC which linker to use? At configure time, gold may not be installed, or it may be uninstalled afterwards, or the user may be using a bindist but for some reason wants to force GHC to use ld or gold (for example, if gold doesn't work for them for some reason).

comment:21 Changed 5 months ago by bgamari

The diff doesn't touch the runtime linker-detection logic. It also doesn't provide a way for the user to override the linker choice, but this could be fixed.

However, I do wish we could drop the runtime probing at some point. Currently we start gcc -v and ld -v on every single GHC compilation. On platforms like Windows this can really add up. Even on Linux, where process spawning is relatively fast, it's probably 5 to 10 milliseconds per execution.

comment:22 Changed 5 months ago by bgamari

Simon Marlow has said that he would really like to see this happen in 8.2.1 due to the regressions in #13739.

comment:23 Changed 5 months ago by bgamari

Milestone: 8.2.1
Priority: normal → highest

comment:24 Changed 5 months ago by bgamari

Owner: set to bgamari

comment:25 Changed 5 months ago by bgamari

Status: new → patch

comment:26 Changed 5 months ago by Ben Gamari <ben@…>

In 625143f4/ghc:

configure: Coerce gcc to use $LD instead of system default

The configure script will now try to coerce gcc to use the linker
pointed to by $LD instead of the system default (typically bfd ld).
Moreover, we now check for `ld.gold` and `ld.lld` before trying `ld`.

The previous behavior can be reverted to by using the new
--disable-ld-override flag.

On my machine gold seems to trigger an apparent infelicity in
constructor behavior, causing T5435_asm to fail. I've opened #13883 to
record this issue and have accepted the questionable constructor
ordering for the time being.

Test Plan: Validate with `config_args='--enable-ld-override'`

Reviewers: austin, hvr, simonmar

Subscribers: duog, nh2, rwbarton, thomie, erikd, snowleopard

GHC Trac Issues: #13541, #13810, #13883

Differential Revision: https://phabricator.haskell.org/D3449
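The search order described in the commit message above (try `ld.gold` and `ld.lld` before `ld`, honor `$LD`, and allow `--disable-ld-override`) might be sketched like this. This is a simplified Python illustration rather than the real autoconf macro. `find_linker` and its parameters are hypothetical, and how `$LD` interacts with a disabled override is an assumption of this sketch.

```python
import shutil
from typing import Callable, Mapping, Optional

# Candidates in the order given in the commit message.
CANDIDATES = ("ld.gold", "ld.lld", "ld")


def find_linker(env: Mapping[str, str],
                ld_override: bool = True,
                which: Callable[[str], Optional[str]] = shutil.which
                ) -> Optional[str]:
    """Pick the linker gcc should be coerced to use, or None for gcc's default."""
    if not ld_override:
        # Previous behavior (--disable-ld-override): leave gcc's default alone.
        return None
    explicit = env.get("LD")
    if explicit:
        # An explicit LD=... from the user takes priority over the search.
        return explicit
    for name in CANDIDATES:
        path = which(name)
        if path is not None:
            return path
    return None
```

For a real run one would pass `os.environ` as `env`. The `which` parameter exists only so the lookup can be replaced in tests.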

comment:27 Changed 5 months ago by bgamari

Status: patch → merge

comment:28 Changed 5 months ago by bgamari

Resolution: fixed
Status: merge → closed

comment:29 Changed 5 months ago by bgamari

Differential Rev(s): Phab:D3449 → Phab:D3449, Phab:D3694
Owner: bgamari deleted
Resolution: fixed
Status: closed → new

I found a bug in the above. See Phab:D3694 for the fix.

comment:30 Changed 5 months ago by bgamari

Status: new → patch

comment:31 Changed 5 months ago by Ben Gamari <ben@…>

In 960918bd/ghc:

Add -fuse-ld flag to CFLAGS during configure

The decisions made by configure later in the script may depend upon the
linker used. Consequently, it is important that configure uses the same
linker as GHC will eventually use.

For instance, on Nix I found that a program requiring `libpthread` would
link fine with only `-lrt` when linked with BFD ld. However, with gold
we needed to explicitly provide the `-lpthread` dependency. Presumably
the former would happily load any `NEEDED` libraries whereas the
latter wants them explicitly given. Regardless, since `configure`'s
`NEED_PTHREAD_LIB` check didn't use the `-fuse-ld` flag that GHC would
eventually use, we inferred the wrong value, resulting in link errors
later in the build.

Test Plan: Validate

Reviewers: austin, hvr

Subscribers: rwbarton, thomie, erikd

GHC Trac Issues: #13541

Differential Revision: https://phabricator.haskell.org/D3694
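To illustrate the point of the commit above, here is a small sketch of a configure-style link probe that carries the same `-fuse-ld` flag GHC will later use. The function `link_probe_command` and the file name `conftest.c` are assumptions for illustration, not the actual `NEED_PTHREAD_LIB` check.

```python
from typing import List, Optional, Sequence


def link_probe_command(cc: str,
                       source: str,
                       libs: Sequence[str],
                       fuse_ld_flag: Optional[str] = None) -> List[str]:
    """Build the command line for a link test (hypothetical helper)."""
    cmd = [cc]
    if fuse_ld_flag:
        # Probe with the same linker that the final build will use.
        cmd.append(fuse_ld_flag)
    cmd += [source, "-o", "conftest"]
    cmd += ["-l" + lib for lib in libs]
    return cmd


# Probing with only -lrt, once without and once with the linker flag.
without_flag = link_probe_command("gcc", "conftest.c", ["rt"])
with_flag = link_probe_command("gcc", "conftest.c", ["rt"], "-fuse-ld=gold")
```

If the two commands can give different answers, as they did on Nix according to the commit message, then only the second one says anything about the build GHC will actually perform.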

comment:32 Changed 5 months ago by bgamari

Resolution: fixed
Status: patch → closed

comment:33 Changed 5 months ago by Ben Gamari <ben@…>

In fcd2db14/ghc:

configure: Ensure that we don't set LD to unusable linker

Previously if we found an unusable linker in PATH (e.g. ld.lld on OS X)
we would notice the -fuse-ld=... was broken, but neglected to reset LD
to a usable linker. This resulted in brokenness on OS X when lld is in
PATH.

Test Plan: Validate on OS X with lld in PATH

Reviewers: austin, hvr, angerman

Reviewed By: angerman

Subscribers: rwbarton, thomie, erikd, angerman

GHC Trac Issues: #13541

Differential Revision: https://phabricator.haskell.org/D3713
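The fallback described in the commit above could be sketched as follows: if the `-fuse-ld=...` flag for the candidate linker turns out to be broken, both the flag and LD are reset together. The names `fuse_ld_flag` and `settle_linker` are hypothetical, and the mapping from file name to flag is an assumption of this sketch.

```python
import os
from typing import Callable, Optional, Tuple


def fuse_ld_flag(linker_path: str) -> Optional[str]:
    """Map a linker path to the matching -fuse-ld flag, if any (assumed mapping)."""
    name = os.path.basename(linker_path)
    # endswith also covers prefixed names such as <toolchain>-ld.gold.
    if name.endswith("ld.gold"):
        return "-fuse-ld=gold"
    if name.endswith("ld.lld"):
        return "-fuse-ld=lld"
    return None


def settle_linker(candidate: str,
                  default_ld: str,
                  flag_works: Callable[[str], bool]
                  ) -> Tuple[str, Optional[str]]:
    """Return (LD, flag); fall back to default_ld if the flag is unusable."""
    flag = fuse_ld_flag(candidate)
    if flag is not None and flag_works(flag):
        return candidate, flag
    # Reset LD as well, so it never points at a linker we decided not to use.
    return default_ld, None
```

Here `flag_works` stands in for a real compile-and-link test with the C compiler. On OS X with lld in PATH it would return False, and the sketch would then hand back the default linker.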

comment:35 Changed 4 months ago by Ben Gamari <ben@…>

In d08b9ccd/ghc:

configure: Ensure that user's LD setting is respected

This broke in the fix for #13541.