Opened 16 months ago

Closed 8 months ago

#8921 closed bug (fixed)

ghc-stage2 fails with ld: fatal: library -lrt: not found on topHandler02(profthreaded) test

Reported by: AlainODea
Owned by: AlainODea
Priority: normal
Milestone:
Component: Compiler
Version: 7.8.1-rc2
Keywords:
Cc: kgardas
Operating System: Solaris
Architecture: x86_64 (amd64)
Type of failure: Compile-time crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description

On SmartOS ghc-stage2 crashes with an error when trying to run the topHandler02(dyn) test:
ld: fatal: library -lrt: not found

I traced this with the following DTrace script, which stops ghc-stage2 at the point where it launches ld:

~/exit_on_ld.d:

#!/usr/sbin/dtrace -s

#pragma D option destructive

syscall::exec*:entry
/copyinstr(arg0) == "/usr/bin/ld"/
{
    trace(pid);
    stop(); 
    system("pargs %d", pid);
    exit(0);
}

This let me observe the command-line arguments to ld:

In ssh session A I started DTrace:

chmod +x ~/exit_on_ld.d
~/exit_on_ld.d

In ssh session B I started the topHandler02(dyn) test manually with no output redirection:

cd ~/ghc/libraries/base/tests && '/root/ghc/inplace/bin/ghc-stage2' -fforce-recomp -dcore-lint -dcmm-lint -dno-debug-output -no-user-package-db -rtsopts -fno-ghci-history -o topHandler02 topHandler02.hs -O -prof -static -auto-all -threaded

In ssh session A I observed the ld call (ghc-stage2 is now frozen by stop):

CPU     ID                    FUNCTION:NAME
  3   5167                      exece:entry     9186691866:     /opt/local/gcc47/libexec/gcc/x86_64-sun-solaris2.11/4.7.3/collect2 -R/opt/local
argv[0]: /opt/local/gcc47/libexec/gcc/x86_64-sun-solaris2.11/4.7.3/collect2
argv[1]: -R/opt/local/lib/
argv[2]: -Y
argv[3]: P,/lib/amd64:/usr/lib/amd64:/opt/local/lib/
argv[4]: -Qy
argv[5]: -o
argv[6]: topHandler02.o
argv[7]: -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3
argv[8]: -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib/amd64
argv[9]: -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../amd64
argv[10]: -L/lib/amd64
argv[11]: -L/usr/lib/amd64
argv[12]: -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib
argv[13]: -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../..
argv[14]: -R/opt/local/gcc47/x86_64-sun-solaris2.11/lib/amd64
argv[15]: -R/opt/local/gcc47/lib/amd64
argv[16]: -lrt
argv[17]: -r
argv[18]: /tmp/ghc91814_0/ghc91814_6.o
argv[19]: /tmp/ghc91814_0/ghc91814_5.o

I was then able to run the same link command in isolation, with GHC's temp files still present on disk:

# /opt/local/gcc47/libexec/gcc/x86_64-sun-solaris2.11/4.7.3/collect2 -R/opt/local/lib/ -Y P,/lib/amd64:/usr/lib/amd64:/opt/local/lib/ -Qy -o topHandler02.o -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib/amd64 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../amd64 -L/lib/amd64 -L/usr/lib/amd64 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../.. -R/opt/local/gcc47/x86_64-sun-solaris2.11/lib/amd64 -R/opt/local/gcc47/lib/amd64 -lrt -r /tmp/ghc91814_0/ghc91814_6.o /tmp/ghc91814_0/ghc91814_5.o
ld: fatal: library -lrt: not found
ld: fatal: file processing errors. No output written to topHandler02.o
collect2: error: ld returned 1 exit status

If I omit -lrt from the arguments it succeeds (on Illumos-based systems, including SmartOS, librt is a passthrough to libc):

# /opt/local/gcc47/libexec/gcc/x86_64-sun-solaris2.11/4.7.3/collect2 -R/opt/local/lib/ -Y P,/lib/amd64:/usr/lib/amd64:/opt/local/lib/ -Qy -o topHandler02.o -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib/amd64 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../amd64 -L/lib/amd64 -L/usr/lib/amd64 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../.. -R/opt/local/gcc47/x86_64-sun-solaris2.11/lib/amd64 -R/opt/local/gcc47/lib/amd64 -r /tmp/ghc91814_0/ghc91814_6.o /tmp/ghc91814_0/ghc91814_5.o
# echo $?
0

No errors are emitted and the exit code is 0 (good).

I would like to conditionally omit -lrt from the ld arguments.
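
A minimal sketch of that kind of conditional filtering, written as a shell function purely for illustration. The name drop_lrt_for_relocatable is hypothetical, and this is not how GHC assembles its link line; it only shows the rule "drop -lrt when -r is also present":

```shell
# Hypothetical helper (not part of GHC): print the given linker arguments,
# one per line, dropping -lrt only when -r (relocatable output) is present.
drop_lrt_for_relocatable() {
    relocatable=no
    # First pass: detect whether -r appears anywhere in the arguments.
    for arg in "$@"; do
        if [ "$arg" = "-r" ]; then
            relocatable=yes
        fi
    done
    # Second pass: echo every argument except -lrt in the relocatable case.
    for arg in "$@"; do
        if [ "$relocatable" = yes ] && [ "$arg" = "-lrt" ]; then
            continue
        fi
        printf '%s\n' "$arg"
    done
}
```

For example, `drop_lrt_for_relocatable -lrt -r a.o b.o` prints -r, a.o and b.o on separate lines, while `drop_lrt_for_relocatable -lrt a.o` leaves -lrt in place.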

Where in the GHC source does ghc-stage2 populate the arguments to ld?

Attachments (1)

IllumosSolarisLinkerOptions.patch (525 bytes) - added by AlainODea 16 months ago.
Correct Illumos/Solaris Linker Options


Change History (11)

comment:1 Changed 16 months ago by AlainODea

  • Summary changed from ghc-stage2 fails with ld: fatal: library -lrt: not found on topHandler02(dyn) test to ghc-stage2 fails with ld: fatal: library -lrt: not found on topHandler02(profthreaded) test

comment:2 Changed 16 months ago by AlainODea

Rich Lowe gave some solid insight on this on smartos-discuss:

You're passing both -lrt and -r, which is likely to cause ld to look for an archive
library (librt.a, which will never exist) and not the shared object, since -r asks for
relocatable output.
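
One way to check the claim about librt.a is to list which librt files actually exist in the linker's search directories. The helper below is a sketch under assumptions: the name list_lib_candidates is hypothetical, and the directories to pass are whatever -L and -Y paths appear on your own link line.

```shell
# Hypothetical helper: list every file named lib<name>.* found in the given
# directories, so you can see whether an archive (.a) exists at all.
list_lib_candidates() {
    name=$1
    shift
    for dir in "$@"; do
        for f in "$dir"/lib"$name".*; do
            # An unmatched glob stays literal, so keep only real files.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
}
```

Running `list_lib_candidates rt /lib/amd64 /usr/lib/amd64` on the affected machine should show only shared objects if the explanation above holds; the absence of librt.a in the listing is what makes -lrt fail under -r.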

However, dropping -r instead (and keeping -lrt) causes a separate problem with unresolved references:

# /opt/local/gcc47/libexec/gcc/x86_64-sun-solaris2.11/4.7.3/collect2 -R/opt/local/lib/ -Y P,/lib/amd64:/usr/lib/amd64:/opt/local/lib/ -Qy -o topHandler02.o -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib/amd64 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../amd64 -L/lib/amd64 -L/usr/lib/amd64 -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../.. -R/opt/local/gcc47/x86_64-sun-solaris2.11/lib/amd64 -R/opt/local/gcc47/lib/amd64 -lrt /tmp/ghc93957_0/ghc93957_6.o /tmp/ghc93957_0/ghc93957_5.o
Undefined                       first referenced
 symbol                             in file
era                                 /tmp/ghc93957_0/ghc93957_6.o
base_GHCziIOziException_zdfExceptionAsyncExceptionzuzdctoException_info /tmp/ghc93957_0/ghc93957_6.o
CC_ID                               /tmp/ghc93957_0/ghc93957_5.o
pushCostCentre                      /tmp/ghc93957_0/ghc93957_6.o
CCS_DONT_CARE                       /tmp/ghc93957_0/ghc93957_6.o
CCS_ID                              /tmp/ghc93957_0/ghc93957_5.o
base_GHCziIOziException_zdfExceptionAsyncExceptionzuzdctoException_closure /tmp/ghc93957_0/ghc93957_6.o
enterFunCCS                         /tmp/ghc93957_0/ghc93957_6.o
newCAF                              /tmp/ghc93957_0/ghc93957_6.o
CC_LIST                             /tmp/ghc93957_0/ghc93957_5.o
base_GHCziIOziException_UserInterrupt_closure /tmp/ghc93957_0/ghc93957_6.o
stg_bh_upd_frame_info               /tmp/ghc93957_0/ghc93957_6.o
CCS_LIST                            /tmp/ghc93957_0/ghc93957_5.o
stg_IND_STATIC_info                 /tmp/ghc93957_0/ghc93957_6.o
stg_raiseIOzh                       /tmp/ghc93957_0/ghc93957_6.o
base_GHCziTopHandler_runMainIO1_info /tmp/ghc93957_0/ghc93957_6.o
base_GHCziTopHandler_runMainIO1_closure /tmp/ghc93957_0/ghc93957_6.o
ld: fatal: symbol referencing errors. No output written to topHandler02.o
collect2: error: ld returned 1 exit status

I don't see the -r option in GNU ld, so this seems to be SunOS/Illumos specific.

comment:3 Changed 16 months ago by AlainODea

This line needs to be removed:
https://github.com/ghc/ghc/blob/master/compiler/main/DynFlags.hs#L1216

It will break -profthreaded on all Solaris and Illumos OSes released since 2006.

It leads to a contradiction: gcc is handed both -r (relocatable output) and -lrt (link librt), a combination that requires librt.a, which does not and never will exist on any Solaris or Illumos system.

I'll provide a patch once I've done some more testing.

Changed 16 months ago by AlainODea

Correct Illumos/Solaris Linker Options

comment:4 Changed 16 months ago by AlainODea

With the attached IllumosSolarisLinkerOptions.patch, the topHandler02 tests all pass on SmartOS, apart from the defect in the test framework that miscalculates the exit code.


comment:5 Changed 16 months ago by AlainODea

  • Status changed from new to patch

comment:6 Changed 16 months ago by AlainODea

  • Owner set to AlainODea

comment:7 Changed 15 months ago by AlainODea

This patch will break -threaded on Solaris 9 and earlier.

On Solaris 10, OpenSolaris, Illumos, and Solaris 11 librt is included in libc.

An alternative would be to identify recent Solaris/Illumos operating systems as something other than OSSolaris2 when generating GHCConstantsHaskellType. That seems far more problematic as it would require changes everywhere OSSolaris2 is used.

Doing a similar trick for Solaris 9 and earlier would have the same problem.

Solaris 10 was released on 2005-01-31. Solaris 9's latest release is 9/05, which was released on 2005-09-03.

I think support for -threaded on Solaris 9 and earlier can be safely excluded by default. Users who need -threaded on GHC 7.8+ on those systems can easily reverse this patch in their build process.
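
For anyone who does need to keep -lrt on the older systems, the version split described above could be sketched in a build script roughly as follows. This is an illustration only, not part of the patch: the function name threaded_rt_flag is hypothetical, and it assumes the release string comes from `uname -r`, which reports 5.9 on Solaris 9, 5.10 on Solaris 10, and 5.11 on Solaris 11 and Illumos.

```shell
# Hypothetical helper: given a SunOS release string as printed by `uname -r`,
# print the extra library flag the threaded way would need, if any.
threaded_rt_flag() {
    minor=${1#5.}
    case $minor in
        # Reject anything that is not of the form 5.<number>.
        ''|*[!0-9]*) return 1 ;;
    esac
    # Solaris 9 and earlier (5.9 and below) still need a separate librt.
    if [ "$minor" -lt 10 ]; then
        printf '%s\n' "-lrt"
    fi
}
```

A build script could then call `threaded_rt_flag "$(uname -r)"` and append whatever it prints to the link flags; on Solaris 10 and later it prints nothing.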


comment:8 Changed 11 months ago by thoughtpolice

  • Cc kgardas added
  • Status changed from patch to infoneeded

Karel, can you please look at this? I'm generally fine with this change, but you're the Solaris master, so you should probably check as well so it doesn't break something for you...

comment:9 follow-up: Changed 11 months ago by kgardas

Hi,
Sorry, Alain, for not including your patch. I did not know about it and fixed this the same way in

commit cc3717597597c031dd8402c443f40f76d432c044
Author: Karel Gardas <[email protected]>
Date:   Mon Jul 28 07:49:12 2014 -0500

    do not link with -lrt on Solaris for threaded way
    
    Summary:
    This patch removes linking with rt library on Solaris
    for threaded way. The reason is simple it casuses few ffi related tests
    failures and also is not needed anymore.
    
    Test Plan: validate
    
    Reviewers: austin
    
    Reviewed By: austin
    
    Subscribers: phaskell, simonmar, relrod, carter
    
    Differential Revision: https://phabricator.haskell.org/D95

so feel free to close this issue.

comment:10 in reply to: ↑ 9 Changed 8 months ago by thomie

  • Resolution set to fixed
  • Status changed from infoneeded to closed

Replying to kgardas:

so feel free to close this issue.
