Opened 3 years ago

Closed 3 years ago

#10058 closed bug (fixed)

Panic: Loading temp shared object failed

Reported by: goldfire
Owned by:
Priority: highest
Milestone: 7.10.1
Component: Runtime System
Version: 7.10.1-rc2
Keywords:
Cc: simonmar
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets: #10110
Differential Rev(s): Phab:D676
Wiki Page:

Description

I ran into a panic when updating singletons for 7.10. I'm clueless as to what's going on here, so sorry for not minimizing the test case. A little testing has me convinced it's Template Haskell in some way.

To reproduce:

> git clone http://github.com/goldfirere/singletons.git
> cd singletons
> git checkout ghc-loading-panic-test-case
> cabal update
> cabal install --only-dependencies
> cabal configure
> cabal build
> cat dist/build/autogen/cabal_macros.h
# copy the value for CURRENT_PACKAGE_KEY from the end of cabal_macros.h
> cd tests/compile-and-dump
> ghc -c -this-package-key <package key from cabal_macros.h> -i../../dist/build -XTemplateHaskell Singletons/Maybe.hs
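The manual copy step above can be automated. The following is only a sketch: it assumes that cabal_macros.h contains a line of the form `#define CURRENT_PACKAGE_KEY "<key>"`, and the helper name `current_package_key` and the sample header text are made up for illustration. The key in the sample is the one that appears in comment:9.

```python
import re

def current_package_key(macros_text):
    """Return the CURRENT_PACKAGE_KEY value found in cabal_macros.h text, or None."""
    # Assumption: the macro is defined on one line with a double-quoted value.
    match = re.search(
        r'^#define\s+CURRENT_PACKAGE_KEY\s+"([^"]+)"',
        macros_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None

# Illustrative stand-in for the contents of dist/build/autogen/cabal_macros.h.
sample = (
    '#define VERSION_base "4.8.0.0"\n'
    '#define CURRENT_PACKAGE_KEY "singl_0TYtxGjPhBTAZuoSNyonA0"\n'
)

key = current_package_key(sample)
print(key)
```

In a real run you would read the header from disk and pass the resulting key to `ghc -this-package-key`.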

You will see something like

ghc: panic! (the 'impossible' happened)
  (GHC version 7.10.0.20150123 for x86_64-apple-darwin):
	Loading temp shared object failed: dlopen(/var/folders/ps/s45r2x1s6r15ws78py_zypl00000gn/T/ghc45837_0/libghc45837_1.dylib, 5): Symbol not found: _mtlzuJNaGzzEkFfL43R3LZZNRlPRm_ControlziMonadziReaderziClass_DZCMonadReader_con_info
  Referenced from: /var/folders/ps/s45r2x1s6r15ws78py_zypl00000gn/T/ghc45837_0/libghc45837_1.dylib
  Expected in: flat namespace
 in /var/folders/ps/s45r2x1s6r15ws78py_zypl00000gn/T/ghc45837_0/libghc45837_1.dylib

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
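The missing symbol in the panic is Z-encoded, which hides which package it belongs to. The sketch below decodes it with a small subset of GHC's Z-encoding escapes (only the ones needed here plus a few common ones; it is not a complete decoder, and `z_decode` is a hypothetical helper, not a GHC tool). It assumes the leading underscore is the usual Mach-O symbol prefix and drops it before decoding.

```python
# Subset of GHC's Z-encoding escapes; an assumption that this is enough for this symbol.
Z_ESCAPES = {
    "zu": "_", "zi": ".", "zz": "z", "ZZ": "Z", "ZC": ":",
    "zm": "-", "zd": "$", "zh": "#", "zq": "'",
}

def z_decode(encoded):
    """Decode a Z-encoded name, passing unknown sequences through unchanged."""
    out = []
    i = 0
    while i < len(encoded):
        pair = encoded[i:i + 2]
        if pair in Z_ESCAPES:
            out.append(Z_ESCAPES[pair])
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)

# Symbol copied from the panic message above.
symbol = ("_mtlzuJNaGzzEkFfL43R3LZZNRlPRm_ControlziMonadziReaderziClass"
          "_DZCMonadReader_con_info")

# Drop the leading underscore (assumed Mach-O prefix) before decoding.
decoded = z_decode(symbol[1:])
print(decoded)
```

The decoded name starts with an `mtl_...` package key followed by `Control.Monad.Reader.Class`, so the unresolved symbol lives in the mtl package rather than in singletons itself.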

I've observed this on a Mac, but Travis has the same problem, so it's not strictly Mac-specific. You can see representative Travis output here.

Why am I doing such a crazy thing? It's part of the singletons test suite, where it's important to test the output of a run of ghc with -ddump-splices. Getting the test cases to compile against the built-but-not-yet-installed singletons object files should work with -this-package-key. I'm sure there's a better way to structure a testsuite, but this general technique works (with -package-name instead of -this-package-key) with 7.8.

Change History (19)

comment:1 Changed 3 years ago by goldfire

This also happens for testing my units library. The testsuite there doesn't do anything particularly strange. However, I can't reproduce locally; only on Travis. See this log.

comment:2 in reply to:  1 Changed 3 years ago by trommler

Architecture: x86_64 (amd64) → Unknown/Multiple
Cc: simonmar added
Component: Compiler → Runtime System
Operating System: MacOS X → Unknown/Multiple
Owner: set to trommler
Type of failure: None/Unknown → Runtime crash

Replying to goldfire:

This also happens for testing my units library. The testsuite there doesn't do anything particularly strange. However, I can't reproduce locally; only on Travis. See this log.

Travis is Linux, isn't it?

I'll have a look.

comment:3 Changed 3 years ago by goldfire

Priority: normal → highest

Thanks for looking into this. I'm bumping the priority up to highest, because it would be embarrassing to release 7.10.1 with this bug in it!

comment:4 Changed 3 years ago by trommler

I'm stuck at a Cabal error message:

peter@montebre:~/projects/haskell/singletons> cabal install --only-dependencies
Resolving dependencies...
Downloading syb-0.4.4...
Downloading th-lift-0.7...
Configuring th-lift-0.7...
Configuring syb-0.4.4...
cabal: Distribution/Client/Config.hs:(246,37)-(299,9): Missing field in record construction configProf

This is a fresh build of ghc-7.10-rc2 and cabal-install 1.22.0.0.

peter@montebre:~/projects/haskell/singletons> ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.10.0.20150123
peter@montebre:~/projects/haskell/singletons> cabal --version
cabal-install version 1.22.0.0
using version 1.22.1.0 of the Cabal library 

comment:5 in reply to:  4 ; Changed 3 years ago by trommler

Replying to trommler:

I'm stuck at a Cabal error message: [...]

Alright, compiling the respective Setup.hs files for all dependencies and singletons, I can now reproduce the issue on Linux.

comment:6 Changed 3 years ago by goldfire

Did you report a cabal bug for comment:4? That looks independent from this bug or from singletons.

comment:7 in reply to:  6 Changed 3 years ago by trommler

Replying to goldfire:

Did you report a cabal bug for comment:4? That looks independent from this bug or from singletons.

Done. See #10099.

comment:8 in reply to:  5 Changed 3 years ago by trommler

Replying to trommler:

Replying to trommler:

I'm stuck at a Cabal error message: [...]

Alright, compiling the respective Setup.hs files for all dependencies and singletons, I can now reproduce the issue on Linux.

I know what is going on. Fix is coming.

comment:1 is a different issue but also related to the change in linking (#8935). I will prepare a separate patch for that.

comment:9 Changed 3 years ago by trommler

I think I have a patch. I am currently validating.

Now I get this for the test:

singletons/tests/compile-and-dump> ghc -c -this-package-key singl_0TYtxGjPhBTAZuoSNyonA0 -i../../dist/build -XTemplateHaskell Singletons/Maybe.hs

Singletons/Maybe.hs:9:3:
    Unexpected kind variable ‘a_a4JS’
      Perhaps you intended to use PolyKinds
    In the declaration for type synonym ‘SMaybe’

Singletons/Maybe.hs:9:3:
    Illegal kind signature: ‘Maybe_a4JP a_a4JS’
      Perhaps you intended to use KindSignatures
    In the declaration for type synonym ‘SMaybe’

Singletons/Maybe.hs:9:3:
    Unexpected kind variable ‘a_a4JS’
      Perhaps you intended to use PolyKinds
    In the data type declaration for ‘JustSym0’

Singletons/Maybe.hs:9:3:
    Illegal kind signature: ‘TyFun a_a4JS (Maybe_a4JP a_a4JS)’
      Perhaps you intended to use KindSignatures
    In the data type declaration for ‘JustSym0’

Singletons/Maybe.hs:9:3:
    Unexpected kind variable ‘a_a4JS’
      Perhaps you intended to use PolyKinds
    In the declaration for type synonym ‘JustSym1’

Singletons/Maybe.hs:9:3:
    Illegal kind signature: ‘a_a4JS’
      Perhaps you intended to use KindSignatures
    In the declaration for type synonym ‘JustSym1’
[...]

comment:10 Changed 3 years ago by trommler

Differential Rev(s): Phab:D676

Phab:D676 fixes this ticket but not the issue mentioned in comment:1.

comment:11 Changed 3 years ago by trommler

Status: new → patch

comment:12 in reply to:  9 ; Changed 3 years ago by goldfire

Replying to trommler:

Now I get this for the test:

The errors there are expected. In the actual testsuite, the testsuite driver specifies a whole lot of extensions on the command line. These weren't needed to tickle the panic, so I didn't include them in the instructions.

However, if the comment:1 issue isn't fixed, I think there's more work to be done here, no?

comment:13 in reply to:  12 Changed 3 years ago by trommler

Replying to goldfire:

However, if the comment:1 issue isn't fixed, I think there's more work to be done here, no?

I would like to open a new ticket for the issue in comment:1 because I think it is a different issue.

The undefined symbol in this ticket is defined in another package, whereas the one in comment:1 is defined in one of the objects of the current package. That symbol, however, should be resolvable by linking with the previous temporary shared object and following the chain of links to the module where the symbol is defined. My current hypothesis is that some Linux distributions modify the defaults of GNU ld (and perhaps gold), and that this is responsible for the undefined symbol.

The fix that I have in mind involves checking for a link editor flag called --no-as-needed (and it must be the link editor we use in the installed ghc) and then passing the correct flag to override the default. See: https://ghc.haskell.org/trac/ghc/ticket/9186#comment:26 for more details on that flag.

The issue does not affect OS X. That would reduce the scope of the defect to only those systems that have a modified ld, i.e. some Linux distributions.

I'll create a new ticket with more details tomorrow.

comment:14 Changed 3 years ago by goldfire

Ah -- I understand just enough of that to see why you want a new ticket.

Thanks much for taking a look at this!

comment:15 Changed 3 years ago by trommler

Owner: trommler deleted

comment:16 in reply to:  14 Changed 3 years ago by trommler

Replying to goldfire:

Ah -- I understand just enough of that to see why you want a new ticket.

Done: #10110

comment:17 Changed 3 years ago by Austin Seipp <austin@…>

In 0fcc454329c4e3e0dc4474412bff599d0e9bdfcd/ghc:

Dynamically link all loaded packages in new object

Summary:
As a result of fixing #8935 we needed to open shared libraries
with RTLD_LOCAL and so symbols from packages loaded earlier
cannot be found anymore. We need to include in the link all
packages loaded so far.

This fixes #10058

Test Plan: validate

Reviewers: hvr, simonmar, austin

Reviewed By: austin

Subscribers: rwbarton, thomie

Differential Revision: https://phabricator.haskell.org/D676

GHC Trac Issues: #10058
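The effect described in the commit summary can be illustrated with a toy model. This is not GHC's linker code or the real dynamic loader: `SharedObject`, `unresolved`, and the symbol names are invented for illustration, and the model only captures the assumption that an object opened in local scope resolves its undefined symbols through its own recorded dependencies, while global scope also sees everything loaded earlier.

```python
from dataclasses import dataclass, field

@dataclass
class SharedObject:
    name: str
    defines: set                 # symbols this object exports
    needs: set                   # undefined symbols it references
    linked: list = field(default_factory=list)  # names of objects it was linked against

def unresolved(obj, loaded, global_scope):
    """Return the symbols of obj that cannot be resolved at load time (toy model)."""
    visible = set(obj.defines)
    # Walk the object's own dependency chain.
    stack, seen = list(obj.linked), set()
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        visible |= loaded[dep].defines
        stack.extend(loaded[dep].linked)
    # With global scope, every previously loaded object is visible as well.
    if global_scope:
        for other in loaded.values():
            visible |= other.defines
    return obj.needs - visible

# A package that was loaded earlier, and a temp object that refers to it.
mtl = SharedObject("mtl", {"MonadReader_con_info"}, set())
loaded = {"mtl": mtl}
temp = SharedObject("temp", {"splice"}, {"MonadReader_con_info"})
temp_fixed = SharedObject("temp", {"splice"}, {"MonadReader_con_info"}, linked=["mtl"])

print(unresolved(temp, loaded, global_scope=True))
print(unresolved(temp, loaded, global_scope=False))
print(unresolved(temp_fixed, loaded, global_scope=False))
```

In this model the temp object loads under global scope, fails under local scope, and loads again once the earlier package is recorded as one of its dependencies, which mirrors the approach the commit message describes.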

comment:18 Changed 3 years ago by thoughtpolice

Status: patch → merge

comment:19 Changed 3 years ago by thoughtpolice

Resolution: fixed
Status: merge → closed

Merged, thanks Peter!
