Opened 8 months ago

Closed 5 months ago

#9480 closed bug (fixed)

Segfault in GHC API code using Shelly

Reported by: agibiansky
Owned by:
Priority: normal
Milestone: 7.10.1
Component: GHC API
Version: 7.8.3
Keywords:
Cc: trommler
Operating System: Linux
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets: #8935
Differential Revisions: Phab:D349

Description

A segfault occurs when using the GHC API with Shelly. I do not know why.

Here is a code snippet that triggers it:

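import GHC
import GHC.Paths ( libdir )
import DynFlags

-- Minimal reproducer: run a Shelly expression through the GHC API interpreter.
main = runGhc (Just libdir) $ do
  dflags <- getSessionDynFlags
  -- Interpret the code and link it in memory.
  setSessionDynFlags $ dflags { hscTarget = HscInterpreted,
          ghcLink = LinkInMemory }
  imports <- mapM parseImportDecl ["import Shelly", "import Prelude"]
  setContext $ map IIDecl imports
  -- This statement triggers the segfault on Linux.
  runStmt "shelly undefined" RunToCompletion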

Note that on Mac OS X, this has no output whatsoever. The segfault occurs on Ubuntu 14.04, with uname -a outputting the following:

Linux yed 3.13.0-34-generic #60-Ubuntu SMP Wed Aug 13 15:45:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I cannot find any way to get this to work, and I do not know what is going wrong or why Shelly triggers it.

Change History (11)

comment:1 Changed 8 months ago by rwbarton

More environ nightmares, with the added twist of the GHCi linker. The problem arises here in PrelIOUtils.o:

00000000000002b0 <__hscore_environ>:
 2b0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 2b7 <__hscore_environ+0x7>
                        2b3: R_X86_64_PC32      environ-0x4
 2b7:   c3                      retq   

Running with the RTS linker debug flag -Dl prints the following (edited slightly for readability):

Rel entry   2 is raw( 0x2b3 0x3a00000002 0xfffffffffffffffc)
lookupSymbol: looking up environ
initLinker: start
initLinker: idempotent return
lookupSymbol: symbol not found
`environ' resolves to 0x7ffff7039fd8
Reloc: P = 0x44f0e2f3   S = 0x7ffff7039fd8   A = 0xfffffffffffffffc

Now environ is too far away for a PC-relative 32-bit reference, so the GHCi linker makes a jump island. That is nonsense here, since environ isn't a function but rather a pointer to an array of C strings.

But also, 0x7ffff7039fd8 isn't the right environ according to gdb: that address holds NULL. What gdb thinks is environ (and my environment does actually appear there) is 0x7ffff7ffe140, which is also too far away from the relocation site.

I wonder how this all worked when you typed System.Environment.getEnvironment into GHCi 7.6?

comment:2 Changed 8 months ago by rwbarton

Here is the corresponding -Dl output when I build the snippet with 7.6:

Rel entry 101325 is raw(0x323c63 0x113a700000002 0xfffffffffffffffc)
lookupSymbol: looking up environ
initLinker: start
initLinker: idempotent return
lookupSymbol: symbol not found
`environ' resolves to 0x24af250
Reloc: P = 0x40429ca3   S = 0x24af250   A = 0xfffffffffffffffc

This time the linker found environ in the executable, which is good (as it's near where we load object files):

rwbarton@morphism:/tmp/shl$ nm /tmp/shl/shl
[...]
00000000024af250 V environ@@GLIBC_2.2.5
[...]

So, the problem must be that in 7.8 internal_dlsym is finding environ in one of the openedSOs before it gets to looking in the executable. Why does it not look in the executable first? Time to reread #8935 I guess...

comment:3 Changed 8 months ago by agibiansky

You're right, I've confirmed that

import System.Environment
getEnvironment

causes the same issue. Thanks for looking into this! Is there any known workaround that can be used in client code? That is, can I fix this in any way in my own programs that use the GHC API before 7.10 is released?

comment:4 Changed 8 months ago by rwbarton

Yes, build your program with -dynamic.

comment:5 Changed 7 months ago by trommler

  • Cc trommler added

comment:6 Changed 6 months ago by agibiansky

Does anyone know whether this bug has been looked at, or whether it will be fixed for GHC 7.10? I would be very interested in getting it fixed by 7.10, but I do not know how to get started fixing it, or whether it has already been fixed...

comment:7 follow-up: ↓ 8 Changed 6 months ago by trommler

Could you check if the patch (Phab:D349) for #8935 fixes the issue?

comment:8 in reply to: ↑ 7 Changed 6 months ago by trommler

  • Differential Revisions set to Phab:D349
  • Status changed from new to patch

Replying to trommler:

Could you check if the patch (Phab:D349) for #8935 fixes the issue?

I backported Phab:D349 to GHC 7.8.3, and the test snippet above no longer segfaults.

comment:9 Changed 6 months ago by trommler

  • Milestone set to 7.10.1

The milestone for #8935 is 7.10.1, so setting it here too.

comment:10 Changed 5 months ago by Austin Seipp <austin@…>

In 383733b9191a36e2d3f757700842dbc3855911d9/ghc:

Fix obscure problem with using the system linker (#8935)

Summary:
In a statically linked GHCi symbol `environ` resolves to NULL when
called from a Haskell script.

When resolving symbols in a Haskell script we need to search the
executable program and its dependent (DT_NEEDED) shared libraries
first and then search the loaded libraries.

We want to be able to override functions in loaded libraries later.
Libraries must be opened with local scope (RTLD_LOCAL) and not global.
The latter adds all symbols to the executable program's symbols where
they are then searched in loading order. We want reverse loading order.

When libraries are loaded with local scope the dynamic linker
cannot use symbols in that library when resolving the dependencies
in another shared library. This changes the way files compiled to
object code must be linked into temporary shared libraries. We link
with the last temporary shared library created so far if it exists.
Since each temporary shared library is linked to the previous temporary
shared library the dynamic linker finds the latest definition of a
symbol by following the dependency chain.

See also Note [RTLD_LOCAL] for a summary of the problem and solution.

Cherry-picked commit 2f8b4c

Changed linker argument ordering

On some ELF systems GNU ld (and others?) default to
--as-needed and the order of libraries in the link
matters.

The last temporary shared library must appear
before all other libraries. Switching the position
of extra_ld_inputs and lib_path_objs does that.

Fixes #8935 and #9186

Reviewers: austin, hvr, rwbarton, simonmar

Reviewed By: simonmar

Subscribers: thomie, carter, simonmar

Differential Revision: https://phabricator.haskell.org/D349

GHC Trac Issues: #8935, #9186, #9480

comment:11 Changed 5 months ago by thoughtpolice

  • Resolution set to fixed
  • Status changed from patch to closed

OK, marking as fixed.
