Opened 2 years ago

Last modified 6 months ago

#10161 new bug

GHC does not relink if we link against a new library with old timestamp

Reported by: nh2
Owned by:
Priority: normal
Milestone:
Component: Driver
Version: 7.8.4
Keywords:
Cc: nh2
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets: #10966
Differential Rev(s):
Wiki Page:


Test case for reproducing:

myexe is an executable that depends on mylib, whose myfun = putStrLn "output 1".

If we install mylib, compile myexe, change mylib's code to myfun = putStrLn "output 2", re-install it, and compile myexe again, GHC does not notice that the library code changed.

It skips re-linking myexe, so the program still prints output 1 even though you told GHC to compile against code that prints output 2.

Note that I had to mark myfun as NOINLINE to trigger the bug. Otherwise myfun's code would have ended up in the interface file, changing the package ID, which naturally forces GHC to relink (and even recompile).
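For reference, a minimal sketch of what the library side of this test case could look like. The module name MyLib and the file name are assumptions; the report only gives the package name mylib and the definition of myfun.

```haskell
-- MyLib.hs: hypothetical module inside the mylib package.
-- The module and file names are assumptions, not taken from the ticket.
module MyLib (myfun) where

-- NOINLINE keeps the body of myfun from being exposed in the interface
-- file, so (as reported in this ticket) editing the string below changes
-- the compiled library without changing the package ID.
{-# NOINLINE myfun #-}
myfun :: IO ()
myfun = putStrLn "output 1"  -- edit to "output 2" before re-installing
```

myexe would then import this module and call myfun from its main.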

With the NOINLINE, the package IDs of the two different versions of mylib are completely identical. I think that is correct, since the package ID only hashes the API and ABI, not the actual implementation, right?

How does GHC decide when to relink? I think in the present case, it doesn't notice that the object/archive file of the library changed. Does it check that somehow? Just looking at the API and ABI can't be enough to make a decision for linking.

Change History (6)

comment:1 Changed 2 years ago by ezyang

Happens with the RTS too. One answer is, if your library changes, its version number should change (but this isn't very satisfactory!)

Last edited 6 months ago by ezyang

comment:2 Changed 16 months ago by thomie

Component: Compiler → Driver

Might have same underlying cause as #10966.

comment:3 Changed 16 months ago by ezyang

So, I'm pretty sure the problem (in this bug) is this:

  1. You made an ABI-compatible change to an upstream library.
  2. Recompilation avoidance decides that myexe does not need to be recompiled (rightly so).
  3. The decision whether to relink or not depends purely on whether or not any local modules got recompiled. Which they did not.

It sort of sounds like we need to store some extra metadata in the final linked executable which talks about the precise objects involved, so we know when to relink.

This analysis is wrong.

Last edited 6 months ago by ezyang

comment:4 Changed 14 months ago by ezyang

There are a few ways to go about fixing this, but we have to be careful to keep GHC's compilation results deterministic.

One consequence of this is that we CANNOT store the paths that were linked to produce the final executable in the executable: we don't want to bake those paths into the build product. But then how does GHC know what to query in order to find out if linking is necessary?

My conclusion is that GHC has to either (1) somehow run ld in a dry-run mode (I could find no flag that actually implements this) or (2) reimplement ld's library-finding logic itself, so that we can determine which files we will actually depend on and then make a linking decision.
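To make option (2) a bit more concrete, here is a rough sketch of what a reimplemented library lookup might look like. It is a deliberately simplified model: all function names are hypothetical, it only covers the common ELF convention of preferring libNAME.so over libNAME.a within each search directory, and a real linker applies many more rules (static linking flags, sysroots, linker scripts, and so on).

```haskell
-- Simplified model of how a linker resolves "-l<name>" against a list of
-- "-L<dir>" search directories. Illustration only; all names here are
-- hypothetical and this is neither GHC nor ld code.

-- | Candidate file names for one library: the shared object first, then
-- the static archive (the usual preference when not linking statically).
libFileNames :: String -> [FilePath]
libFileNames name = ["lib" ++ name ++ ".so", "lib" ++ name ++ ".a"]

-- | Every path that would be probed for one library, in search order:
-- directories are tried in the order given, and within each directory
-- the file names are tried in the order from 'libFileNames'.
candidatePaths :: [FilePath] -> String -> [FilePath]
candidatePaths searchDirs name =
  [ dir ++ "/" ++ file | dir <- searchDirs, file <- libFileNames name ]

-- | Pick the first candidate that exists. The existence check is passed
-- in as a predicate so the lookup stays pure; a real implementation
-- would query the file system instead.
resolveLib :: (FilePath -> Bool) -> [FilePath] -> String -> Maybe FilePath
resolveLib exists searchDirs name =
  case filter exists (candidatePaths searchDirs name) of
    (path : _) -> Just path
    []         -> Nothing
```

The resolved paths would only be used to decide whether to relink; they would not need to be stored in the executable, which keeps the build product free of baked-in paths.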

Last edited 6 months ago by ezyang

comment:5 Changed 6 months ago by ezyang

Summary: GHC does not relink if a library's code changed → GHC does not relink if we link against a new library with old timestamp

comment:6 Changed 6 months ago by ezyang

I think thomie is right and the base problem is the same: we relink only if any of the inputs to the linker is newer than the executable. In the provided test case, both libraries are built before the initial link, so nothing is newer. A workaround is to touch the library after you do an operation like this.
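The timestamp rule described in this comment can be modelled in a few lines. This is a simplified sketch with hypothetical names and made-up integer timestamps, not GHC's actual code, but it shows why the test case slips through and why touch helps.

```haskell
-- Simplified model of a timestamp-based relink check.
-- Illustration only; the names are hypothetical and this is not GHC's code.

-- | Relink when at least one linker input is strictly newer than the
-- existing executable. Timestamps can be any ordered type.
needsRelink :: Ord t => t -> [t] -> Bool
needsRelink exeTime inputTimes = any (> exeTime) inputTimes

-- | The scenario from the ticket, with made-up integer timestamps:
-- the replacement library was built at time 2, before the executable
-- was first linked at time 3, so no input looks newer.
staleLibraryCase :: Bool
staleLibraryCase = needsRelink (3 :: Int) [2]

-- | The workaround: touching the library moves its timestamp to time 4,
-- past the executable's, so the check fires.
touchedLibraryCase :: Bool
touchedLibraryCase = needsRelink (3 :: Int) [4]
```

Here staleLibraryCase evaluates to False (no relink, which is the bug) and touchedLibraryCase evaluates to True.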
