Opened 5 years ago

Closed 16 months ago

Last modified 15 months ago

#4210 closed feature request (fixed)

LLVM: Dynamic Library Support

Reported by: dterei
Owned by: dterei
Priority: low
Milestone: 7.6.2
Component: Compiler (LLVM)
Version: 6.13
Keywords:
Cc: william.knop.nospam@…, bgamari@…
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description (last modified by dterei)

Dynamic library status:

 * Supported: Linux 64bit, Mac OSX 64bit
 * Unsupported: Linux 32bit, Mac OSX 32bit, Windows 32bit

The LLVM backend doesn't support dynamic libraries at the moment.

LLC supports a flag called '-relocation-model' that can be used to support this. It takes the following options:

default        - let the target choose.
static         - static code.
pic            - PIC code.
dynamic-no-pic - Dynamic references but static code

These options roughly correspond to GHC's -fPIC and -dynamic flags.
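
As a rough illustration of that correspondence, here is a minimal sketch of how the two GHC flags could be turned into an llc relocation model argument. The type and function names (RelocModel, relocModel, llcFlag) are made up for this example and are not GHC's actual internals; the precedence of -fPIC over -dynamic is also an assumption.

```haskell
-- Hypothetical sketch: pick llc's relocation model from two GHC-style flags.
-- None of these names come from GHC itself.

-- | Relocation models that llc accepts (leaving out "default").
data RelocModel = Static | Pic | DynamicNoPic
  deriving (Eq, Show)

-- | Map the -fPIC and -dynamic settings to a relocation model.
-- Assumption: -fPIC wins; -dynamic alone asks for dynamic references
-- in otherwise static code.
relocModel :: Bool -> Bool -> RelocModel
relocModel fPIC dynamic
  | fPIC      = Pic
  | dynamic   = DynamicNoPic
  | otherwise = Static

-- | Render the model as an llc command-line argument.
llcFlag :: RelocModel -> String
llcFlag m = "-relocation-model=" ++ name
  where
    name = case m of
      Static       -> "static"
      Pic          -> "pic"
      DynamicNoPic -> "dynamic-no-pic"

main :: IO ()
main = mapM_ (putStrLn . llcFlag . uncurry relocModel)
  [(False, False), (True, False), (False, True), (True, True)]
```

Running it prints one llc argument per flag combination, which makes it easy to see which combinations collapse to the same model.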

Linux: Simply adding the correct flag to LLC seems to work fine.
Mac OSX, Windows: Adding the correct flag doesn't work at all; all programs segfault.

Change History (33)

comment:1 Changed 5 years ago by igloo

  • Milestone set to 6.14.1

comment:2 Changed 5 years ago by dterei

Have enabled support for -fPIC and -dynamic on Linux x64. For other platforms, have changed DynFlags.hs to issue a warning and drop -fllvm if -dynamic or -fPIC is also present.

comment:3 Changed 4 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:4 Changed 4 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:5 Changed 4 years ago by simonmar

  • Priority changed from normal to high

It was pointed out to me today that 64 bit OSX has -fPIC turned on by default, because that's what the platform ABI requires (quite sensibly), but GHC disables -fllvm automatically as a result. Hence you can't use -fllvm on 64-bit OSX at all.

I expect we're being too conservative here and it would be fine to enable -fllvm. As I understand it, -dynamic is the real problem, not -fPIC. David?

comment:6 follow-up: Changed 4 years ago by dterei

  • Summary changed from LLVM: Dynamic Library Support to LLVM: Dynamic Library Support1

I don't really know without testing. What is the state of the 64bit version of GHC on OSX? Ben Lippmeier has graciously provided me with access to his OS X machine and I've been using this to do any testing and development. His machine seems to be running a 32bit kernel though (although it's running 10.6 on Core i7 hardware). It seems though that I should be able to run a 64bit version of GHC on this still, correct? Also, how closely does GHC follow the standard ABI for PIC on OSX? I think you're correct that -dynamic is the real problem though.

comment:7 Changed 4 years ago by dterei

  • Summary changed from LLVM: Dynamic Library Support1 to LLVM: Dynamic Library Support

comment:8 in reply to: ↑ 6 Changed 4 years ago by tibbe

Replying to dterei:

I don't really know without testing. What is the state of the 64bit version of GHC on OSX? Ben Lippmeier has graciously provided me with access to his OS X machine and I've been using this to do any testing and development. His machine seems to be running a 32bit kernel though (although it's running 10.6 on Core i7 hardware). It seems though that I should be able to run a 64bit version of GHC on this still, correct?

I'm running a 64-bit GHC with a 32-bit kernel and it works fine:

$ uname -a
Darwin tibell-macbookpro.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386 i386

$ sysctl hw | grep 64bit
hw.cpu64bit_capable: 1

$ ghc --info
 [("Project name","The Glorious Glasgow Haskell Compilation System")
 ,("Project version","7.0.3")
 ,("Booter version","7.0.2")
 ,("Stage","2")
 ,("Build platform","x86_64-apple-darwin")
 ,("Host platform","x86_64-apple-darwin")
 ,("Target platform","x86_64-apple-darwin")
 ,("Have interpreter","YES")
 ,("Object splitting","YES")
 ,("Have native code generator","YES")
 ,("Have llvm code generator","YES")
 ,("Support SMP","YES")
 ,("Unregisterised","NO")
 ,("Tables next to code","YES")
 ,("RTS ways","l debug  thr thr_debug thr_l thr_p ")
 ,("Leading underscore","YES")
 ,("Debug on","False")
 ,("LibDir","/Library/Frameworks/GHC.framework/Versions/7.0.3-x86_64/usr/lib/ghc-7.0.3")
 ,("Global Package DB","/Library/Frameworks/GHC.framework/Versions/7.0.3-x86_64/usr/lib/ghc-7.0.3/package.conf.d")
 ,("C compiler flags","[\"-m64\",\"-fno-stack-protector\"]")
 ,("Gcc Linker flags","[\"-m64\"]")
 ,("Ld Linker flags","[\"-arch\",\"x86_64\"]")
 ]

comment:9 Changed 4 years ago by dterei

So I had a test of it on OSX. It mostly works but there are a significant number of testsuite failures. On 32bit OSX there are no failures with the LLVM backend. On 64bit using -fPIC there are around 70 failures. I haven't looked into why they are failing at all yet; most of the ffi tests seem to fail, so potentially all the failures are related to that. I'm not sure when I'll have time to figure out what causes the failures, so I'm tempted to enable the LLVM backend under 64bit OSX and just add a big warning that it has a good chance of not working... Depends on when the 7.2 release happens, I guess.

comment:10 Changed 4 years ago by dterei

Test summary:

OVERALL SUMMARY for test run started at Thu Apr 28 13:08:36 EST 2011
    2726 total tests, which gave rise to
   11658 test cases, of which
       0 caused framework failures
    2103 were skipped

    9174 expected passes
     278 expected failures
       0 unexpected passes
     103 unexpected failures

Unexpected failures:
   1185(optllvm)
   1288(optllvm)
   1679(optllvm)
   2276(optllvm)
   2469(optllvm)
   2594(optllvm)
   4038(optllvm)
   4221(optllvm)
   CPUTime001(normal,ghci,threaded1,threaded2,profthreaded)
   T1969(normal)
   T3245(normal,hpc,optasm,profasm,threaded1,threaded2,profthreaded,optllvm)
   T3822(hpc,ghci)
   T4144(optllvm)
   T4801(normal)
   ThreadDelay001(ghci)
   annrun01(normal,hpc,optasm,ghci,threaded1,threaded2,optllvm)
   arith001(optllvm)
   arith008(optllvm)
   arrowrun004(optllvm)
   barton-mangler-bug(optllvm)
   cgrun015(optllvm)
   cgrun034(optllvm)
   cgrun044(optllvm)
   conc021(optllvm)
   conc036(optllvm)
   conc039(optllvm)
   conc040(optllvm)
   derefnull(profthreaded)
   divbyzero(profthreaded)
   dph-diophantine-opt(normal,threaded1,threaded2)
   dynamic001(normal,hpc,optasm,profasm,ghci,threaded1,threaded2,profthreaded,optllvm)
   expfloat(optllvm)
   fed001(optllvm)
   ffi003(optllvm)
   ffi005(optllvm)
   ffi006(optllvm)
   ffi007(optllvm)
   ffi008(optllvm)
   ffi009(optllvm)
   ffi010(optllvm)
   ffi011(optllvm)
   ffi013(optllvm)
   ffi016(optllvm)
   ffi019(optllvm)
   ffi020(optllvm)
   ffi021(optllvm)
   fptr02(normal,hpc,optasm,threaded1,threaded2)
   fun_insts(optllvm)
   hpc_markup_multi_001(normal)
   hpc_markup_multi_002(normal)
   hpc_markup_multi_003(normal)
   joao-circular(optllvm)
   jules_xref(optllvm)
   num009(optllvm)
   num010(optllvm)
   objc-hi(profasm,ghci,profthreaded)
   outofmem(normal)
   signals002(ghci,optllvm)
   signals004(ghci,threaded1,threaded2,profthreaded)
   spec001(normal,hpc,optasm,profasm)
   time002(ghci)
   time004(ghci)

comment:11 Changed 4 years ago by altaic

  • Cc william.knop.nospam@… added

comment:12 Changed 4 years ago by dterei

OK, fixed up 64bit OSX. Commit 22423fc93a008732e426f10f1b545b5d571173f3.

Simon M, perhaps change the priority of this back to Normal or even low now.

comment:13 Changed 4 years ago by dterei

Oh sorry, you also need to pull in commit 50e0db459cb1b1341bbd527a3c450f0930e6ab43 which updates the mangler to work on 64bit OSX.

comment:14 Changed 4 years ago by simonmar

  • Priority changed from high to normal

comment:15 Changed 4 years ago by simonmar

David: in commit 22423fc93a008732e426f10f1b545b5d571173f3 you disabled -fllvm on OS X 64 unconditionally, because it has -fPIC on by default. Was that intentional? This was the problem that originally caused us to raise this ticket to high prio, and it seems to be back again. I'm confused.

comment:16 Changed 4 years ago by dterei

No, my intention was to enable it.

    | not (cTargetArch == X86_64 && (cTargetOS == Linux || cTargetOS == OSX)) && 

See the email thread but my parsing of that boolean logic enables OSX 64 bit with the LLVM backend. Am I mistaken?
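
One way to check that reading is to evaluate the quoted sub-expression for each platform. The sketch below is only an illustration: Arch and OS are stand-ins for the real cTargetArch and cTargetOS types, and since the rest of the guard is cut off above, the picOrDynamic argument is an assumed placeholder for it (based on comment 2, where -fllvm is dropped when -fPIC or -dynamic is present).

```haskell
-- Hypothetical stand-ins for the types behind cTargetArch / cTargetOS.
data Arch = X86 | X86_64 deriving (Eq, Show)
data OS = Linux | OSX | Windows deriving (Eq, Show)

-- | The quoted sub-expression: True for every platform except
-- x86_64 Linux and x86_64 OS X.
unsupportedPlatform :: Arch -> OS -> Bool
unsupportedPlatform arch os =
  not (arch == X86_64 && (os == Linux || os == OSX))

-- | Assumption: the elided remainder of the guard tests whether -fPIC or
-- -dynamic was requested; 'picOrDynamic' stands in for that test.
dropsLlvm :: Arch -> OS -> Bool -> Bool
dropsLlvm arch os picOrDynamic = unsupportedPlatform arch os && picOrDynamic

main :: IO ()
main = mapM_ report [(a, o) | a <- [X86, X86_64], o <- [Linux, OSX, Windows]]
  where
    report (a, o) =
      putStrLn (show a ++ "/" ++ show o ++ ": -fllvm dropped with -fPIC? "
                ++ show (dropsLlvm a o True))
```

Under these assumptions the clause is False for x86_64 OS X, so this part of the guard on its own would not drop -fllvm there. It says nothing about other code paths that might.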

comment:17 Changed 4 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:18 Changed 3 years ago by dterei

  • Description modified (diff)

comment:19 Changed 3 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from normal to low

comment:20 Changed 3 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:21 Changed 2 years ago by dterei

Simon Marlow said in an email:

Also I believe even if it works, the code that LLVM generates for -dynamic
is not very good. This is because it makes every symbol reference a
dynamic reference, whereas the NCG only makes dynamic references for
symbols in other packages.  It ought to be possible to fix this by using
the right symbol declarations (I'm guessing, I haven't looked into it).

I can't remember if this is correct or not, but I'm recording it here to look into when I have time, as part of dynamic support in LLVM.

comment:22 Changed 2 years ago by carter

Has more progress been made on this ticket? Or what's still needed?

comment:23 Changed 2 years ago by dterei

No more progress. If anything, perhaps things have regressed as I haven't looked at dynlib support in LLVM for a fair while. If status is the same though I don't know how much it is worth worrying about supporting 32bit.

comment:24 Changed 16 months ago by bgamari

  • difficulty set to Unknown

I have some work relevant to this in https://github.com/bgamari/ghc/compare/llvm-intra-package. This branch both fixes the LLVM backend's support for dynamic linking and attempts to avoid dynamic references for intra-package calls. That being said, it's still a work in progress.

comment:25 Changed 16 months ago by bgamari

  • Cc bgamari@… added
  • Status changed from new to patch

I just sent this message to the list characterizing the behavior of the NCG and LLVM backends in handling dynamic references. In short, the NCG and LLVM backends both handle intra-package references efficiently. That is, the branch I cited in comment 24 shouldn't be necessary.

In this case, all we need is the fixes in this branch for functional dynamic linking with the LLVM backend. This has been tested on both x86_64 and ARM.

comment:26 Changed 16 months ago by dterei

OK, so just to check, the issue to fix is just the one initially mentioned of using @object annotations instead of @function annotations?

Last edited 16 months ago by dterei

comment:27 Changed 16 months ago by bgamari

That is correct. Using @object prevents the linker from sending calls through the PLT.

However, it would be nice to hear Simon confirm that his fears surrounding the efficiency of intra-package calls have been addressed.

comment:28 Changed 16 months ago by thoughtpolice

  • Resolution set to fixed
  • Status changed from patch to closed

Merged, thanks Ben!

comment:29 Changed 15 months ago by bgamari

Unfortunately, things are still broken with BFD ld due to ld bug 16177 (https://sourceware.org/bugzilla/show_bug.cgi?id=16177) wherein ld inexplicably generates R_ARM_COPY relocations where a standard R_ARM_ABS32 relocation would do just fine (as gold does). Sadly, despite the bug being reported two months ago, there has been no activity from the ld side. For now I suspect we'll just have to advise users to use gold on ARM.

comment:30 Changed 15 months ago by bgamari

Actually, this is quite perplexing, as I would have expected the R_ARM_COPY relocation to have already been performed by ld.so by the time we get to executing code. That being said, in place of numerous data symbols I see what appears to be trampoline code:

#5  0xb61f8eaa in evacuate (p=0xb5cfd194) at rts/sm/Evac.c:384
384       ASSERTM(LOOKS_LIKE_CLOSURE_PTR(q), "invalid closure, info=%p", q->header.info);
(gdb) print q
$1 = (StgClosure *) 0x16bf8 <Main_zdfEqModule_closure>
(gdb) print *q
$2 = {header = {info = 0x16d64 <ghczmprim_GHCziClasses_DZCEq_static_info>}, payload = 0x16bfc <Main_zdfEqModule_closure+4>}
(gdb) x /5i 0x16d64
   0x16d64 <ghczmprim_GHCziClasses_DZCEq_static_info>:  ldr     r3, [r5]
   0x16d68 <ghczmprim_GHCziClasses_DZCEq_static_info+4>:        add     r7, r7, #1
   0x16d6c <ghczmprim_GHCziClasses_DZCEq_static_info+8>:        blx     r3
   0x16d70 <ghczmprim_GHCziClasses_DZCEq_static_info+12>:       bx      lr
   0x16d74:     movs    r0, r0

comment:31 Changed 15 months ago by bgamari

It turns out that the code above is the same as is seen in libHSghc-prim*.so so it seems that the R_ARM_COPY relocation is indeed being correctly performed. Given that this same code works with gold, it seems likely that this isn't trampoline code but instead the correct code for this symbol. I now think that it is the copy itself that is causing the crash due to tables-before-code. While the linker knows to copy the symbol itself, it doesn't copy the info table preceding it.

Trying to make the runtime linker do the right thing with R_ARM_COPY with tables-before-code enabled is going to be very difficult, if not impossible. If this hypothesis is correct, we should disable tables-before-code on ARM when using affected linkers (all versions of bfd ld, as far as I can tell).
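
To make the hypothesis concrete, here is a toy model of it. This is not how ld or ld.so are implemented; the addresses, the one-word table, and all names (libraryImage, copyRelocate, infoTableAt) are invented for illustration. It only shows that copying a symbol's own extent leaves whatever sat immediately before it behind.

```haskell
-- Toy model only: memory is an association list from address to contents.
type Addr = Int
type Memory = [(Addr, String)]

-- | Library image: an info table word sits just before the symbol's code.
libraryImage :: Memory
libraryImage = [(99, "info-table"), (100, "entry-code-0"), (101, "entry-code-1")]

-- | Model of a copy relocation: copy only the symbol's own extent,
-- [start, start + size), to a new base address.
copyRelocate :: Addr -> Int -> Addr -> Memory -> Memory
copyRelocate start size newBase mem =
  [ (newBase + (a - start), v) | (a, v) <- mem, a >= start, a < start + size ]

-- | How the runtime finds the table in this model: one word before the symbol.
infoTableAt :: Addr -> Memory -> Maybe String
infoTableAt sym = lookup (sym - 1)

main :: IO ()
main = do
  -- In the library, the table is where the runtime expects it.
  print (infoTableAt 100 libraryImage)
  -- After the copy, nothing was placed in front of the relocated symbol.
  print (infoTableAt 500 (copyRelocate 100 2 500 libraryImage))
```

The first lookup finds the table and the second finds nothing, which matches the crash described above if the runtime reads the info table relative to the copied symbol's address.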

comment:32 Changed 15 months ago by simonmar

Yes, R_COPY relocations and tables-next-to-code don't work well together, so I believe we do some trickery on other platforms to prevent the linker from generating them. Take a look at compiler/nativeGen/PIC.hs and search for "copy".

comment:33 Changed 15 months ago by bgamari

Here is a patch set to ensure that people aren't bitten by this. We sacrifice support for binutils' ld when dynamic linking on ARM but at least people don't encounter random segfaults.
