Opened 4 years ago

Last modified 5 months ago

#5620 new bug

Dynamic linking and threading does not work on Windows

Reported by: Lennart
Owned by:
Priority: normal
Milestone: 7.12.1
Component: Runtime System
Version: 7.2.1
Keywords:
Cc: bill@…, idhameed@…
Operating System: Windows
Architecture: x86
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description

On Windows, compile this module:

main = print 42

With this command:

ghc --make -threaded -dynamic Main.hs

Run, and watch it segfault.

Change History (16)

comment:1 Changed 4 years ago by simonmar

  • Milestone set to 7.4.1
  • Priority changed from normal to high

Ok, we understand what's going on here. It relates to our previous discussion about which libraries are linked to the RTS, and naming issues for DLLs (and SOs on Unix).

Basically the base package is linked against the vanilla RTS DLL, and the -threaded flag has caused the main program to be linked against the threaded DLL, with the result that the program is linked against two RTS DLLs, each with their own state. Obviously this goes wrong.

There's a quick workaround, incidentally. Instead of saying -threaded, do something like this:

$ cp c:/ghc/ghc-7.2.2/lib/libHSrts_thr-ghc7.2.2.dll libHSrts-ghc7.2.2.dll

This copies the threaded RTS into the current directory and gives it the same name as the vanilla RTS. When you run the program, Windows looks for the RTS DLL, picks up this copy first, and the program ends up linked against the threaded RTS.
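One way to check whether the workaround above took effect is to have the program report which RTS flavour it is running on. The sketch below uses `rtsSupportsBoundThreads` from `Control.Concurrent` in base, which is `True` on the threaded RTS. The helper `describeRts` is a hypothetical name introduced for illustration, and it is an assumption (not confirmed in this ticket) that the value reflects the renamed DLL when the program is built with `-dynamic` but without `-threaded`.

```haskell
import Control.Concurrent (rtsSupportsBoundThreads)

-- Hypothetical helper: turn the RTS capability flag into a readable label.
describeRts :: Bool -> String
describeRts True  = "threaded RTS"
describeRts False = "non-threaded RTS"

main :: IO ()
main = putStrLn ("Running on the " ++ describeRts rtsSupportsBoundThreads)
```

If the copied DLL is being picked up, this would be expected to report the threaded RTS even though `-threaded` was not passed at link time.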

We're looking into how to resolve this mess.

comment:2 Changed 3 years ago by simonpj

  • difficulty set to Unknown

See also #4824

comment:3 Changed 3 years ago by simonpj

  • Owner set to igloo

For 7.4 we'll simply emit a warning with -dynamic on Windows, pointing to this ticket. Ian will do this.

That leaves the real problem still open.

comment:4 Changed 3 years ago by igloo

  • Milestone changed from 7.4.1 to 7.4.2

comment:5 Changed 3 years ago by rl

FWIW, 7.4.1 doesn't emit a warning here.

comment:6 Changed 3 years ago by duncan

The general problem is explained here: http://www.well-typed.com/blog/30. That post describes it in the context of ELF, but the problem with PE/.dll is essentially the same. The mechanisms available to solve it are different.

To summarise: the problem is that we want to be able to compile packages as .dlls without yet committing to a particular flavour of RTS. We want to be able to choose, when we finally link our program, whether we'll use the normal, threaded, debug, eventlog, etc. flavour. These flavours are all ABI compatible, so that's OK. With static linking this works fine because we only do one big link at the very end, including the RTS. With dynamic linking, we need to link each .dll with its dependencies, including the RTS.

On ELF we currently have a hack that works (in the sense of not segfaulting) but it's not nice or convenient. That hack works simply by not linking package .so files against the RTS at all, leaving all the RTS symbols dangling. Then only the final executable gets linked to the RTS. That works on ELF because ELF allows dangling symbols. The same hack does not work with PE/.dll because, sensibly, PE does not allow dangling symbols in .dlls. So currently on Windows we link package .dlls to the vanilla RTS, but that means that if you link the main .exe to the threaded RTS then we are linking to two RTS .dll files and boom!
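The "two RTS DLLs, each with their own state" situation can be pictured with a small toy model. This is only an illustrative sketch: the real RTS state lives in C globals inside each DLL, all type and function names below are hypothetical, and the idea that start-up touches only the copy the executable was linked against is an assumption about how the failure arises, not something confirmed in this ticket.

```haskell
-- Toy model only: one record per loaded copy of the RTS.
data RtsCopy = RtsCopy
  { dllName     :: String  -- file name of this RTS DLL
  , initialised :: Bool    -- has start-up run on this copy?
  } deriving (Eq, Show)

-- A process that has loaded two RTS DLLs, as described above.
data Process = Process
  { linkedByExe      :: RtsCopy  -- threaded RTS, pulled in by the -threaded .exe
  , linkedByPackages :: RtsCopy  -- vanilla RTS, pulled in by package DLLs such as base
  } deriving (Eq, Show)

loadProcess :: Process
loadProcess = Process
  { linkedByExe      = RtsCopy "libHSrts_thr-ghc7.2.2.dll" False
  , linkedByPackages = RtsCopy "libHSrts-ghc7.2.2.dll" False
  }

-- Assumption: start-up initialises only the copy the executable links against.
startProgram :: Process -> Process
startProgram p = p { linkedByExe = (linkedByExe p) { initialised = True } }

main :: IO ()
main = do
  let p = startProgram loadProcess
  putStrLn ("exe's RTS initialised:      " ++ show (initialised (linkedByExe p)))
  putStrLn ("packages' RTS initialised:  " ++ show (initialised (linkedByPackages p)))
```

In this model the package DLLs end up talking to a copy of the RTS that start-up never touched, which is the kind of inconsistency that sharing a single RTS DLL name across flavours would rule out.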

So, what I think we need is a solution where all the RTS flavours that are ABI compatible should share the same internal dll name, but they should live in different sub-directories. All the package dlls should be linked using an import library for the RTS. The import lib would be common between all the RTS flavours (using an import lib would also resolve the recursive symbol dependencies between the RTS and base libs).

The only question then is how does the final linked .exe find the right RTS dll. That should use assemblies. Assemblies are a mechanism introduced with XP that we should now make use of. Local assemblies let you stick a bunch of dlls in a subdir, with a little xml manifest containing a GUID. The exe is then linked with another xml manifest identifying the GUIDs of the assemblies it needs. The Windows dynamic linker then uses these manifests to find the dlls in the local subdirs. It's in many ways like the ELF RPATH/RUNPATH mechanism, except that it is limited to local subdirs where the exe lives rather than anywhere on the system.

More generally, all ghc packages built as dlls should use the assembly mechanism.

comment:8 Changed 3 years ago by igloo

It turns out that assemblies aren't really better than the

cp c:/ghc/ghc-7.2.2/lib/libHSrts_thr-ghc7.2.2.dll libHSrts-ghc7.2.2.dll

solution. They would just allow us to put the DLLs in subdirectories, to make things a bit tidier.

In particular, an assembly doesn't let you say "Use /path/to/my.dll" (with an absolute path), only "Use foo/my.dll" (with a relative path).

comment:9 Changed 3 years ago by igloo

  • Milestone changed from 7.4.2 to 7.4.3

comment:10 Changed 3 years ago by igloo

  • Milestone changed from 7.4.3 to 7.6.2

comment:11 Changed 2 years ago by rassilon

  • Cc bill@… added

Could we somehow abuse the delay load DLL functionality with a custom delay load helper function?
Or we could do something similar to what MinGW (if I recall correctly) does in order to be able to process dynamically loaded data relocations.

That seems like a really large hammer though. :(

comment:12 Changed 2 years ago by ihameed

  • Cc idhameed@… added

comment:13 Changed 23 months ago by igloo

  • Owner igloo deleted

comment:14 Changed 11 months ago by thoughtpolice

  • Priority changed from high to normal

Lowering priority (these tickets are assigned to older versions, so they're getting bumped as they've been around for a while).

comment:15 Changed 11 months ago by thoughtpolice

  • Milestone changed from 7.6.2 to 7.10.1

Moving to 7.10.1.

comment:16 Changed 5 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.
