Opened 7 years ago

Closed 7 years ago

Last modified 12 months ago

#4013 closed bug (fixed)

build fails on OS X: Invalid Mach-O file:Address out of bounds while relocating object file

Reported by: igloo Owned by:
Priority: high Milestone: 7.2.1
Component: Compiler Version: 6.13
Keywords: Cc: pho@…, bos@…, johan.tibell@…, dgoldsmith@…
Operating System: MacOS X Architecture: x86
Type of failure: Building GHC failed Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Validate goes through on OS X, but a normal build (such as that done by the nightly builder) fails with:

"inplace/bin/ghc-stage2"   -H32m -O    -package-name dph-seq-0.4.0 -hide-all-packages -i -ilibraries/dph/dph-seq/../dph-common -ilibraries/dph/dph-seq/dist-install/build -ilibraries/dph/dph-seq/dist-install/build/autogen -Ilibraries/dph/dph-seq/dist-install/build -Ilibraries/dph/dph-seq/dist-install/build/autogen -Ilibraries/dph/dph-seq/.    -optP-include -optPlibraries/dph/dph-seq/dist-install/build/autogen/cabal_macros.h -package array-0.3.0.0 -package base-4.2.0.0 -package dph-base-0.4.0 -package dph-prim-seq-0.4.0 -package ghc-6.13.20100425 -package ghc-prim-0.2.0.0 -package random-1.0.0.2 -package template-haskell-2.4.0.0  -Odph -funbox-strict-fields -haddock -fcpr-off -fdph-this -package-name dph-seq -XTypeFamilies -XGADTs -XRankNTypes -XBangPatterns -XMagicHash -XUnboxedTuples -XTypeOperators -O2 -XGenerics -fno-warn-deprecated-flags -Wwarn     -odir libraries/dph/dph-seq/dist-install/build -hidir libraries/dph/dph-seq/dist-install/build -stubdir libraries/dph/dph-seq/dist-install/build -hisuf hi -osuf  o -hcsuf hc -c libraries/dph/dph-seq/../dph-common/Data/Array/Parallel/Lifted/PArray.hs -o libraries/dph/dph-seq/dist-install/build/Data/Array/Parallel/Lifted/PArray.o
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.3.0.0 ... linking ... done.
Loading package containers-0.3.0.0 ... linking ... done.
Loading package filepath-1.1.0.4 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package old-time-1.0.0.4 ... linking ... done.
Loading package unix-2.4.0.1 ... linking ... done.
ghc-stage2: internal error: Invalid Mach-O file:Address out of bounds while relocating object file
    (GHC version 6.13.20100425 for i386_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
make[1]: *** [libraries/dph/dph-seq/dist-install/build/Data/Array/Parallel/Lifted/PArray.o] Abort trap
make: *** [all] Error 2

This one passed: http://darcs.haskell.org/ghcBuilder/builders/tn23/8.html

This one failed: http://darcs.haskell.org/ghcBuilder/builders/tn23/9.html

Attachments (2)

ghc66215_0.split_s.gz (57.8 KB) - added by igloo 7 years ago.
ghc66215_0.split__1.s (6.0 KB) - added by igloo 7 years ago.


Change History (29)

comment:1 Changed 7 years ago by ujihisa

Architecture: Unknown/Multiple → x86
Version: 6.12.2 → 6.13

Same here. To be precise, my Mac OS X is Leopard (not Snow Leopard).

$ make
===--- updating makefiles phase 0
make -r --no-print-directory -f ghc.mk phase=0 just-makefiles
===--- updating makefiles phase 1
make -r --no-print-directory -f ghc.mk phase=1 just-makefiles
===--- updating makefiles phase 2
make -r --no-print-directory -f ghc.mk phase=2 just-makefiles
===--- updating makefiles phase 3
make -r --no-print-directory -f ghc.mk phase=3 just-makefiles
===--- finished updating makefiles
make -r --no-print-directory -f ghc.mk all
"inplace/bin/ghc-stage2"   -H32m -O    -package-name dph-seq-0.4.0 -hide-all-packages -i -ilibraries/dph/dph-seq/../dph-common -ilibraries/dph/dph-seq/dist-install/build -ilibraries/dph/dph-seq/dist-install/build/autogen -Ilibraries/dph/dph-seq/dist-install/build -Ilibraries/dph/dph-seq/dist-install/build/autogen -Ilibraries/dph/dph-seq/.    -optP-include -optPlibraries/dph/dph-seq/dist-install/build/autogen/cabal_macros.h -package array-0.3.0.0 -package base-4.2.0.0 -package dph-base-0.4.0 -package dph-prim-seq-0.4.0 -package ghc-6.13.20100424 -package ghc-prim-0.2.0.0 -package random-1.0.0.2 -package template-haskell-2.4.0.0  -Odph -funbox-strict-fields -haddock -fcpr-off -fdph-this -package-name dph-seq -XTypeFamilies -XGADTs -XRankNTypes -XBangPatterns -XMagicHash -XUnboxedTuples -XTypeOperators -O2 -XGenerics -fno-warn-deprecated-flags -Wwarn     -odir libraries/dph/dph-seq/dist-install/build -hidir libraries/dph/dph-seq/dist-install/build -stubdir libraries/dph/dph-seq/dist-install/build -hisuf hi -osuf  o -hcsuf hc -c libraries/dph/dph-seq/../dph-common/Data/Array/Parallel/Lifted/PArray.hs -o libraries/dph/dph-seq/dist-install/build/Data/Array/Parallel/Lifted/PArray.o
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.3.0.0 ... linking ... done.
Loading package containers-0.3.0.0 ... linking ... done.
Loading package filepath-1.1.0.4 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package old-time-1.0.0.4 ... linking ... done.
Loading package unix-2.4.0.1 ... linking ... done.
Loading package directory-1.0.1.1 ... linking ... done.
Loading package pretty-1.0.1.1 ... linking ... done.
Loading package process-1.0.1.2 ... linking ... done.
ghc-stage2: internal error: Invalid Mach-O file:Address out of bounds while relocating object file
    (GHC version 6.13.20100424 for i386_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
make[1]: *** [libraries/dph/dph-seq/dist-install/build/Data/Array/Parallel/Lifted/PArray.o] Abort trap
make: *** [all] Error 2

comment:2 Changed 7 years ago by thorkilnaur

Inspired by a successful build (http://darcs.haskell.org/ghcBuilder/builders/tn23/8/8.html)

"inplace/bin/ghc-stage2"   -H32m -O    -package-name dph-par-0.4.0 -hide-all-packages -i -ilibraries/dph/dph-par/../dph-common -ilibraries/dph/dph-par/dist-install/build -ilibraries/dph/dph-par/dist-install/build/autogen -Ilibraries/dph/dph-par/dist-install/build -Ilibraries/dph/dph-par/dist-install/build/autogen -Ilibraries/dph/dph-par/.    -optP-include -optPlibraries/dph/dph-par/dist-install/build/autogen/cabal_macros.h -package array-0.3.0.0 -package base-4.2.0.0 -package dph-base-0.4.0 -package dph-prim-par-0.4.0 -package ghc-6.13.20100423 -package ghc-prim-0.2.0.0 -package random-1.0.0.2 -package template-haskell-2.4.0.0  -Odph -funbox-strict-fields -haddock -fcpr-off -fdph-this -package-name dph-par -XTypeFamilies -XGADTs -XRankNTypes -XBangPatterns -XMagicHash -XUnboxedTuples -XTypeOperators -O2 -XGenerics -fno-warn-deprecated-flags -Wwarn     -odir libraries/dph/dph-par/dist-install/build -hidir libraries/dph/dph-par/dist-install/build -stubdir libraries/dph/dph-par/dist-install/build -hisuf hi -osuf  o -hcsuf hc -c libraries/dph/dph-par/../dph-common/Data/Array/Parallel/Lifted/PArray.hs -o libraries/dph/dph-par/dist-install/build/Data/Array/Parallel/Lifted/PArray.o
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.3.0.0 ... linking ... done.
Loading package containers-0.3.0.0 ... linking ... done.
Loading package filepath-1.1.0.4 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package old-time-1.0.0.4 ... linking ... done.
Loading package unix-2.4.0.1 ... linking ... done.
Loading package directory-1.0.1.1 ... linking ... done.
Loading package pretty-1.0.1.1 ... linking ... done.
Loading package process-1.0.1.2 ... linking ... done.
Loading package Cabal-1.9.0 ... linking ... done.
Loading package bytestring-0.9.1.5 ... linking ... done.
Loading package binary-0.5.0.2 ... linking ... done.
Loading package bin-package-db-0.0.0.0 ... linking ... done.
Loading package hpc-0.5.0.5 ... linking ... done.
Loading package template-haskell ... linking ... done.
Loading package ghc-6.13.20100423 ... linking ... done.
Loading package time-1.1.4 ... linking ... done.
Loading package random-1.0.0.2 ... linking ... done.
Loading package dph-base-0.4.0 ... linking ... done.
Loading package dph-prim-interface-0.4.0 ... linking ... done.
Loading package dph-prim-seq-0.4.0 ... linking ... done.
Loading package dph-prim-par-0.4.0 ... linking ... done.
Loading package ffi-1.0 ... linking ... done.

I tried

thorkil-naurs-intel-mac-mini:~/tn/builders/GHCBuilder/tn23/builder/tempbuild/build thorkilnaur$ inplace/bin/ghc-stage2 --interactive -package Cabal-1.9.0
GHCi, version 6.13.20100501: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.3.0.0 ... linking ... done.
Loading package containers-0.3.0.0 ... linking ... done.
Loading package filepath-1.1.0.4 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package old-time-1.0.0.4 ... linking ... done.
Loading package unix-2.4.0.1 ... linking ... done.
Loading package directory-1.0.1.1 ... linking ... done.
Loading package pretty-1.0.1.1 ... linking ... done.
Loading package process-1.0.1.2 ... linking ... done.
ghc-stage2: internal error: Invalid Mach-O file:Address out of bounds while relocating object file
    (GHC version 6.13.20100501 for i386_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Abort trap
thorkil-naurs-intel-mac-mini:~/tn/builders/GHCBuilder/tn23/builder/tempbuild/build thorkilnaur$

The error message is issued by rts/Linker.c:

static unsigned long relocateAddress(
    ObjectCode* oc,
    int nSections,
    struct section* sections,
    unsigned long address)
{
    int i;
    for(i = 0; i < nSections; i++)
    {
        if(sections[i].addr <= address
            && address < sections[i].addr + sections[i].size)
        {
            return (unsigned long)oc->image
                    + sections[i].offset + address - sections[i].addr;
        }
    }
    barf("Invalid Mach-O file:"
         "Address out of bounds while relocating object file");
    return 0;
}

I also found that after unpulling the patch

Fri Apr 23 14:48:53 CEST 2010  Simon Marlow <marlowsd@gmail.com>
  * reinstate eta-expansion during SimplGently, to fix inlining of sequence_

the build succeeded. But I am not sure whether this has any real relevance or is just a shuffling of generated code that causes this weakness in the Mach-O part of Linker.c to surface.

Best regards Thorkil

comment:3 Changed 7 years ago by chak

Another reason to get rid of the RTS linker in favour of dyld.

comment:4 Changed 7 years ago by PHO

Cc: pho@… added

comment:5 Changed 7 years ago by PHO

I have examined the problem a bit:

Linker.c:4265:

#ifdef powerpc_HOST_ARCH
		    else if(scat->r_type == PPC_RELOC_SECTDIFF
		        || scat->r_type == PPC_RELOC_LO16_SECTDIFF
		        || scat->r_type == PPC_RELOC_HI16_SECTDIFF
		        || scat->r_type == PPC_RELOC_HA16_SECTDIFF
			|| scat->r_type == PPC_RELOC_LOCAL_SECTDIFF)
#else
                    else if(scat->r_type == GENERIC_RELOC_SECTDIFF
                        || scat->r_type == GENERIC_RELOC_LOCAL_SECTDIFF)
#endif
		    {
		        struct scattered_relocation_info *pair =
		                (struct scattered_relocation_info*) &relocs[i+1];

		        if(!pair->r_scattered || pair->r_type != GENERIC_RELOC_PAIR)
		            barf("Invalid Mach-O file: "
		                 "RELOC_*_SECTDIFF not followed by RELOC_PAIR");

		        word = (unsigned long)
		               (relocateAddress(oc, nSections, sections, scat->r_value)
		              - relocateAddress(oc, nSections, sections, pair->r_value));
		        i++;
		    }

The relocation problem occurred at the following line:

		              - relocateAddress(oc, nSections, sections, pair->r_value));

Investigating further with gdb:

(gdb) p *scat
$11 = {
  r_scattered = 1, 
  r_pcrel = 0, 
  r_length = 2, 
  r_type = 8, // PPC_RELOC_SECTDIFF
  r_address = 2237192, // 0x00222308
  r_value = 3585652
}

(gdb) p *pair
$13 = {
  r_scattered = 1, 
  r_pcrel = 0, 
  r_length = 2, 
  r_type = 1, 
  r_address = 0, 
  r_value = -60264 // wtf?
}

(gdb) p (unsigned long)pair->r_value
$43 = 4294907032 // 0xffff1498: Way too large.
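
The two gdb values above are the same bit pattern read two ways. A minimal sketch of that reinterpretation, assuming r_value is a signed 32-bit field (as gdb's output of -60264 suggests) and that unsigned long is 32 bits wide on the hosts in this ticket; the helper name is hypothetical and not part of Linker.c:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret a signed 32-bit r_value as the
   unsigned address that ends up being passed to relocateAddress().
   Conversion to uint32_t is defined as reduction modulo 2^32, so a
   negative r_value wraps around to the top of the address range. */
static uint32_t r_value_as_address(int32_t r_value)
{
    return (uint32_t)r_value;
}
```

With r_value = -60264 this gives 0xffff1498, which is far above any address that could lie inside a loaded object file of this size.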

comment:6 Changed 7 years ago by PHO

So I suspected that it was a memory corruption. But...

% otool -r /path/to/the/problematic/HSCabal-1.9.0.o
...
Relocation information (__TEXT,__text) 447043 entries
address  pcrel length extern type    scattered symbolnum/value
...
00222308 0     2      n/a    8       1         0x0036b674
00000000 0     2      n/a    1       1         0xffff1498  // Exactly the same as I saw with gdb!
...

% otool -t -v /path/to/the/problematic/HSCabal-1.9.0.o
...
___stginit_Cabalzm1zi9zi0_DistributionziSimpleziPackageIndex:
00222304        b       0x2221f8
l1000:
00222308        .long 0x0037a1dc // The strange reloc address.
0022230c        .long 0x00000000
00222310        .long 0x00200001
...
_Cabalzm1zi9zi0_DistributionziSimpleziPreProcess_zdwa1_info:
00222468        addi    r25,r25,0x2c
0022246c        lwz     r31,0x5c(r27)
...

% otool -r /path/to/the/problematic/Distribution/Simple/PreProcess.o
...
00000000 0     2      n/a    8       1         0x0000e240
00000000 0     2      n/a    1       1         0xfffffffc // Way too large.
...

The relocation table was actually broken.

comment:7 Changed 7 years ago by PHO

And I found that I could work around this problem with "SplitObjs = NO". I currently have no idea why this occurs. Is there something wrong with the evil splitter?

comment:8 Changed 7 years ago by bos

Cc: bos@… added

comment:10 Changed 7 years ago by tibbe

Cc: johan.tibell@… added

comment:11 Changed 7 years ago by dgoldsmith

Cc: dgoldsmith@… added

comment:12 Changed 7 years ago by igloo

Owner: set to igloo

comment:13 Changed 7 years ago by igloo

Hmm, can anyone see what's going on with the 0x0000000c vs 0xfffffffc here?:

$ cat q.s

.const_data
foo:
        .long  123
        .align 2
.text
        .align 2
        .long   foo - bar + 16
        .long   0
        .long   125
bar:
        .long 124

$ gcc -c q.s
$ ld -r q.o -o w.o
$ otool -r ?.o
q.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    4       1         0x00000010
00000000 0     2      n/a    1       1         0x0000000c
w.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    4       1         0x00000010
00000000 0     2      n/a    1       1         0xfffffffc
$ 

comment:14 Changed 7 years ago by igloo

Owner: igloo deleted
Priority: highest → normal

I've disabled object splitting on OS X as a workaround.

A proper fix from an OS X expert would be much better, though.

comment:15 Changed 7 years ago by simonpj

But q.s looks bogus. foo is in the .const_data section but bar is in the .text section. You can't compute an offset (foo-bar) between the two!

Changed 7 years ago by igloo

Attachment: ghc66215_0.split_s.gz added

Changed 7 years ago by igloo

Attachment: ghc66215_0.split__1.s added

comment:16 Changed 7 years ago by igloo

I've attached a .split_s and .split__1.s file that show the same problem, with e.g. _Cabalzm1zi9zi2_Foo_chattyTry1_srt and _s4Nt_info.

comment:17 Changed 7 years ago by igloo

Milestone: 7.0.1 → 7.0.2

comment:17 Changed 7 years ago by thorkilnaur

With my PPC Mac OS X

$ uname -a
Darwin thorkil-naurs-mac-mini.local 9.8.0 Darwin Kernel Version 9.8.0: Wed Jul 15 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC Power Macintosh
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.0.1
$ 

validating a recent HEAD, the same thing happens, even with, presumably, no object splitting:

"inplace/bin/ghc-stage2"   -H32m -O -Wall -Werror -H64m -O0    -package-name dph-seq-0.5 -hide-all-packages -i -ilibraries/dph/dph-seq/../dph-common -ilibraries/dph/dph-seq/dist-install/build -ilibraries/dph/dph-seq/dist-install/build/autogen -Ilibraries/dph/dph-seq/dist-install/build -Ilibraries/dph/dph-seq/dist-install/build/autogen -Ilibraries/dph/dph-seq/.    -optP-include -optPlibraries/dph/dph-seq/dist-install/build/autogen/cabal_macros.h -package array-0.3.0.2 -package base-4.3.1.0 -package dph-base-0.5 -package dph-prim-seq-0.5 -package ghc-7.1.20101221 -package ghc-prim-0.2.0.0 -package random-1.0.0.3 -package template-haskell-2.5.0.0  -Odph -funbox-strict-fields -fcpr-off -fdph-this -package-name dph-seq -XTypeFamilies -XGADTs -XRankNTypes -XBangPatterns -XMagicHash -XUnboxedTuples -XTypeOperators -no-user-package-conf -rtsopts -O2 -XGenerics -O -dcore-lint -fno-warn-deprecated-flags -Wwarn    -odir libraries/dph/dph-seq/dist-install/build -hidir libraries/dph/dph-seq/dist-install/build -stubdir libraries/dph/dph-seq/dist-install/build -hisuf hi -osuf  o -hcsuf hc -c libraries/dph/dph-seq/../dph-common/Data/Array/Parallel/Lifted/PArray.hs -o libraries/dph/dph-seq/dist-install/build/Data/Array/Parallel/Lifted/PArray.o
...
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.3.0.2 ... linking ... done.
Loading package containers-0.4.0.0 ... linking ... done.
Loading package filepath-1.2.0.0 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package old-time-1.0.0.6 ... linking ... done.
Loading package unix-2.4.1.0 ... linking ... done.
Loading package directory-1.1.0.0 ... linking ... done.
Loading package pretty-1.0.1.2 ... linking ... done.
Loading package process-1.0.1.4 ... linking ... done.
Loading package Cabal-1.11.0 ... linking ... ghc-stage2: internal error: Invalid Mach-O file:Address out of bounds while relocating object file
    (GHC version 7.1.20101221 for powerpc_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
...
make[1]: *** [libraries/dph/dph-seq/dist-install/build/Data/Array/Parallel/Lifted/PArray.o] Abort trap
make[1]: *** Waiting for unfinished jobs....
...
make: *** [all] Error 2
$

Reducing the problem slightly:

$ "inplace/bin/ghc-stage2" --interactive -package Cabal-1.11.0
GHCi, version 7.1.20101221: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
...
Loading package Cabal-1.11.0 ... linking ... ghc-stage2: internal error: Invalid Mach-O file:Address out of bounds while relocating object file
    (GHC version 7.1.20101221 for powerpc_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Abort trap
$

Introducing some additional debugging information:

$ darcs whatsnew -u
hunk ./rts/Linker.c 4576
     unsigned long address)
 {
     int i;
-    IF_DEBUG(linker, debugBelch("relocateAddress: start\n"));
+    IF_DEBUG(linker, debugBelch("relocateAddress: Looking for 0x%lx through %d sections\n", address, nSections));
     for (i = 0; i < nSections; i++)
     {
hunk ./rts/Linker.c 4579
-            IF_DEBUG(linker, debugBelch("    relocating address in section %d\n", i));
+            IF_DEBUG(linker, debugBelch("    relocating address in section %d (0x%x..0x%x of size 0x%x)\n", i, sections[i].addr, sections[i].addr + sections[i].size, sections[i].size));
         if (sections[i].addr <= address
             && address < sections[i].addr + sections[i].size)
         {
hunk ./rts/Linker.c 4619
 
     for(i=0;i<n;i++)
     {
+        IF_DEBUG(linker, debugBelch("relocateSection: Relocating %d of %d\n", i, n));
 #ifdef x86_64_HOST_ARCH
         struct relocation_info *reloc = &relocs[i];
 
hunk ./rts/Linker.c 4681
             IF_DEBUG(linker, debugBelch("               : desc  = %d\n", symbol->n_desc));
             IF_DEBUG(linker, debugBelch("               : value = %p\n", (void *)symbol->n_value));
             if ((symbol->n_type & N_TYPE) == N_SECT) {
+                IF_DEBUG(linker, debugBelch("relocateSection: relocateAddress1( %p )\n", (void *)symbol->n_value));
                 value = relocateAddress(oc, nSections, sections,
                                         symbol->n_value);
                 IF_DEBUG(linker, debugBelch("relocateSection, defined external symbol %s, relocated address %p\n", nm, (void *)value));
hunk ./rts/Linker.c 4774
                     // Step 1: Figure out what the relocated value should be
                     if(scat->r_type == GENERIC_RELOC_VANILLA)
                     {
+                        IF_DEBUG(linker, debugBelch("relocateSection: relocateAddress2( %p )\n", (void *)scat->r_value));
                         word = *wordPtr + (unsigned long) relocateAddress(
                                                                 oc,
                                                                 nSections,
hunk ./rts/Linker.c 4800
                             barf("Invalid Mach-O file: "
                                  "RELOC_*_SECTDIFF not followed by RELOC_PAIR");
 
+                        IF_DEBUG(linker, debugBelch("relocateSection: relocateAddress3( %p )\n", (void *)scat->r_value));
+                        IF_DEBUG(linker, debugBelch("               : relocateAddress4( %p )\n", (void *)pair->r_value));
                         word = (unsigned long)
                                (relocateAddress(oc, nSections, sections, scat->r_value)
                               - relocateAddress(oc, nSections, sections, pair->r_value));
hunk ./rts/Linker.c 4841
                         }
 
 
+                        IF_DEBUG(linker, debugBelch("relocateSection: relocateAddress5( %p )\n", (void *)scat->r_value));
                         word += (unsigned long) relocateAddress(oc, nSections, sections, scat->r_value)
                                                 - scat->r_value;
 
$ 

Then, having built GHC with GhcDebugged=YES:

$ "inplace/bin/ghc-stage2" --interactive -package Cabal-1.11.0 +RTS -Dl 2>&1 | tail -22 
relocateSection: Relocating 171916 of 480560
lookupSymbol: looking up _filepathzm1zi2zi0zi0_SystemziFilePathziPosix_normalise_info
initLinker: start
initLinker: idempotent return
lookupSymbol: value of _filepathzm1zi2zi0zi0_SystemziFilePathziPosix_normalise_info is 0x56e707c
relocateSection: Relocating 171917 of 480560
relocateSection: relocateAddress3( 0x3fe298 )
               : relocateAddress4( 0xfffef2bc )
relocateAddress: Looking for 0x3fe298 through 5 sections
    relocating address in section 0 (0x0..0x3b9294 of size 0x3b9294)
    relocating address in section 1 (0x3b9294..0x3d0829 of size 0x17595)
    relocating address in section 2 (0x3d082c..0x3eff6c of size 0x1f740)
    relocating address in section 3 (0x3eff6c..0x402ee8 of size 0x12f7c)
relocateAddress: Looking for 0xfffef2bc through 5 sections
    relocating address in section 0 (0x0..0x3b9294 of size 0x3b9294)
    relocating address in section 1 (0x3b9294..0x3d0829 of size 0x17595)
    relocating address in section 2 (0x3d082c..0x3eff6c of size 0x1f740)
    relocating address in section 3 (0x3eff6c..0x402ee8 of size 0x12f7c)
    relocating address in section 4 (0x402ee8..0x402f18 of size 0x30)
ghc-stage2: internal error: Invalid Mach-O file:Address out of bounds while relocating object file
    (GHC version 7.1.20101221 for powerpc_apple_darwin)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
$ 

So, similar to the observation by PHO, the linker attempts to locate the address 0xfffef2bc within each of the 5 sections and fails.
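
The search in the log can be replayed outside the RTS. The following is only a sketch: the section ranges are copied from the -Dl output above, while struct section_range and the function names are hypothetical stand-ins for the real struct section and relocateAddress:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the Mach-O `struct section`:
   only the two fields that the containment test reads. */
struct section_range {
    uint32_t addr;
    uint32_t size;
};

/* The five section ranges printed in the debug output above. */
static const struct section_range logged_sections[] = {
    { 0x000000, 0x3b9294 },
    { 0x3b9294, 0x017595 },
    { 0x3d082c, 0x01f740 },
    { 0x3eff6c, 0x012f7c },
    { 0x402ee8, 0x000030 },
};

/* Same containment test as relocateAddress, but returns the index of
   the matching section, or -1 where relocateAddress would call barf(). */
static int find_section(const struct section_range *sections, int n,
                        uint32_t address)
{
    for (int i = 0; i < n; i++) {
        if (sections[i].addr <= address
            && address < sections[i].addr + sections[i].size) {
            return i;
        }
    }
    return -1;
}

/* Convenience wrapper over the logged table. */
static int find_logged_section(uint32_t address)
{
    int n = (int)(sizeof logged_sections / sizeof logged_sections[0]);
    return find_section(logged_sections, n, address);
}
```

Looking up 0x3fe298 stops at section 3, as in the log, while 0xfffef2bc runs through all five sections without a match.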

To identify the file loaded:

$ "inplace/bin/ghc-stage2" --interactive -package Cabal-1.11.0 +RTS -Dl 2>&1 | grep loadObj
initLinker: inserting rts symbol _loadObj, 0x1da3ce0
initLinker: inserting rts symbol _unloadObj, 0x1da4178
Loading package ghc-prim ... loadObj .../libraries/ghc-prim/dist-install/build/HSghc-prim-0.2.0.0.o
loadObj done.
...
Loading package Cabal-1.11.0 ... loadObj .../libraries/Cabal/dist-install/build/HSCabal-1.11.0.o
loadObj done.
$ 

Now attacking HSCabal-1.11.0.o with otool -rV, we find:

0025f70c True  long   True   BR24    False     _filepathzm1zi2zi0zi0_SystemziFilePathziPosix_normalise_info
0025f6f8 False long   n/a    SECTDIF True      0x003fe298
         False byte   n/a    PAIR    True      0xfffef2bc

This matches the failing relocation addresses and gives us the address 0025f6f8 of the code that needs this relocation.

Attacking HSCabal-1.11.0.o again, this time with otool -tv, we find:

...
0025f6d8        lis     r31,ha16(___stginit_base_DataziMaybe_)
0025f6dc        addi    r31,r31,lo16(___stginit_base_DataziMaybe_)
0025f6e0        stw     r31,_Cabalzm1zi11zi0_DistributionziCompiler_zdfReadCompilerFlavor16_info_dsp(r22)
0025f6e4        addi    r22,r22,0x4
0025f6e8        lwz     r31,0xfffc(r22)
0025f6ec        mtspr   ctr,r31
0025f6f0        bctr
___stginit_Cabalzm1zi11zi0_DistributionziSimpleziPackageIndex:
0025f6f4        b       ___stginit_Cabalzm1zi11zi0_DistributionziSimpleziPackageIndex_
_s5eM_info_dsp:
0025f6f8        .long 0x0040efdc
0025f6fc        .long 0x00000000
0025f700        .long 0x00200001
_s5eM_info:
0025f704        or      r15,r14,r14
0025f708        addi    r22,r22,0x4
0025f70c        b       _filepathzm1zi2zi0zi0_SystemziFilePathziPosix_normalise_info
0025f710        .long 0x0000000c
_s4u6_info_dsp:
0025f714        .long 0x0019eb88
0025f718        .long 0x00020000
0025f71c        .long 0x00130001
_s4u6_info:
...
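
One arithmetic observation on these numbers (not an explanation of where they come from): the word stored at 0025f6f8 equals the SECTDIF value minus the PAIR value when the subtraction is done modulo 2^32. The same holds for the values PHO reported in comment:6. A small sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: difference of the two r_value fields of a
   SECTDIFF/PAIR couple, computed in unsigned 32-bit arithmetic so
   that it wraps modulo 2^32. */
static uint32_t sectdiff_value(uint32_t sectdiff_r_value,
                               uint32_t pair_r_value)
{
    return (uint32_t)(sectdiff_r_value - pair_r_value);
}
```

For the relocation above, 0x003fe298 - 0xfffef2bc gives 0x0040efdc, the word shown at _s5eM_info_dsp. For PHO's object file, 0x0036b674 - 0xffff1498 gives 0x0037a1dc, the word shown at l1000.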

HSCabal-1.11.0.o is generated by:

"/Users/thorkilnaur/tn/bin/ld" -r -o libraries/Cabal/dist-install/build/HSCabal-1.11.0.o  libraries/Cabal/dist-install/build/Distribution/Compiler.o libraries/Cabal/dist-install/build/Distribution/InstalledPackageInfo.o libraries/Cabal/dist-install/build/Distribution/License.o libraries/Cabal/dist-install/build/Distribution/Make.o libraries/Cabal/dist-install/build/Distribution/ModuleName.o libraries/Cabal/dist-install/build/Distribution/Package.o libraries/Cabal/dist-install/build/Distribution/PackageDescription.o libraries/Cabal/dist-install/build/Distribution/PackageDescription/Configuration.o libraries/Cabal/dist-install/build/Distribution/PackageDescription/Parse.o libraries/Cabal/dist-install/build/Distribution/PackageDescription/Check.o libraries/Cabal/dist-install/build/Distribution/PackageDescription/PrettyPrint.o libraries/Cabal/dist-install/build/Distribution/ParseUtils.o libraries/Cabal/dist-install/build/Distribution/ReadE.o libraries/Cabal/dist-install/build/Distribution/Simple.o libraries/Cabal/dist-install/build/Distribution/Simple/Build.o libraries/Cabal/dist-install/build/Distribution/Simple/Build/Macros.o libraries/Cabal/dist-install/build/Distribution/Simple/Build/PathsModule.o libraries/Cabal/dist-install/build/Distribution/Simple/BuildPaths.o libraries/Cabal/dist-install/build/Distribution/Simple/Command.o libraries/Cabal/dist-install/build/Distribution/Simple/Compiler.o libraries/Cabal/dist-install/build/Distribution/Simple/Configure.o libraries/Cabal/dist-install/build/Distribution/Simple/GHC.o libraries/Cabal/dist-install/build/Distribution/Simple/LHC.o libraries/Cabal/dist-install/build/Distribution/Simple/Haddock.o libraries/Cabal/dist-install/build/Distribution/Simple/Hugs.o libraries/Cabal/dist-install/build/Distribution/Simple/Install.o libraries/Cabal/dist-install/build/Distribution/Simple/InstallDirs.o libraries/Cabal/dist-install/build/Distribution/Simple/JHC.o libraries/Cabal/dist-install/build/Distribution/Simple/LocalBuildInfo.o 
libraries/Cabal/dist-install/build/Distribution/Simple/NHC.o libraries/Cabal/dist-install/build/Distribution/Simple/PackageIndex.o libraries/Cabal/dist-install/build/Distribution/Simple/PreProcess.o libraries/Cabal/dist-install/build/Distribution/Simple/PreProcess/Unlit.o libraries/Cabal/dist-install/build/Distribution/Simple/Program.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/Ar.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/Builtin.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/Db.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/HcPkg.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/Ld.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/Run.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/Script.o libraries/Cabal/dist-install/build/Distribution/Simple/Program/Types.o libraries/Cabal/dist-install/build/Distribution/Simple/Register.o libraries/Cabal/dist-install/build/Distribution/Simple/Setup.o libraries/Cabal/dist-install/build/Distribution/Simple/SrcDist.o libraries/Cabal/dist-install/build/Distribution/Simple/Test.o libraries/Cabal/dist-install/build/Distribution/Simple/UHC.o libraries/Cabal/dist-install/build/Distribution/Simple/UserHooks.o libraries/Cabal/dist-install/build/Distribution/Simple/Utils.o libraries/Cabal/dist-install/build/Distribution/System.o libraries/Cabal/dist-install/build/Distribution/TestSuite.o libraries/Cabal/dist-install/build/Distribution/Text.o libraries/Cabal/dist-install/build/Distribution/Verbosity.o libraries/Cabal/dist-install/build/Distribution/Version.o libraries/Cabal/dist-install/build/Distribution/Compat/ReadP.o libraries/Cabal/dist-install/build/Language/Haskell/Extension.o libraries/Cabal/dist-install/build/Distribution/GetOpt.o libraries/Cabal/dist-install/build/Distribution/Compat/Exception.o libraries/Cabal/dist-install/build/Distribution/Compat/CopyFile.o 
libraries/Cabal/dist-install/build/Distribution/Compat/TempFile.o libraries/Cabal/dist-install/build/Distribution/Simple/GHC/IPI641.o libraries/Cabal/dist-install/build/Distribution/Simple/GHC/IPI642.o libraries/Cabal/dist-install/build/Paths_Cabal.o      `/usr/bin/find libraries/Cabal/dist-install/build -name "*_stub.o" -print`

Running each of these input .o files through nm, looking for _s5eM_info_dsp and _filepathzm1zi2zi0zi0_SystemziFilePathziPosix_normalise_info, it appears that libraries/Cabal/dist-install/build/Distribution/Simple/PreProcess.o is the only .o-file that defines/uses both of these. Thus, we generate a suitable PreProcess.s and look at the initial part of it:

$ head -27 PreProcess.s 
.const_data
.align 2
.globl _Cabalzm1zi11zi0_DistributionziSimpleziPreProcess_zdwa_srt
_Cabalzm1zi11zi0_DistributionziSimpleziPreProcess_zdwa_srt:
	.long	_Cabalzm1zi11zi0_DistributionziSimpleziUtils_die_closure
	.long	_Cabalzm1zi11zi0_DistributionziSimpleziUtils_writeUTF8File1_closure
	.long	_Cabalzm1zi11zi0_DistributionziSimpleziPreProcessziUnlit_unlit_closure
	.long	_Cabalzm1zi11zi0_DistributionziSimpleziUtils_withUTF8FileContents1_closure
	.long	_filepathzm1zi2zi0zi0_SystemziFilePathziPosix_normalise_closure
.data
.align 2
.globl _Cabalzm1zi11zi0_DistributionziSimpleziPreProcess_zdwa_closure
_Cabalzm1zi11zi0_DistributionziSimpleziPreProcess_zdwa_closure:
	.long	_Cabalzm1zi11zi0_DistributionziSimpleziPreProcess_zdwa_info
	.long	0
.text
.align 2
_s5eL_info_dsp:
	.long	_Cabalzm1zi11zi0_DistributionziSimpleziPreProcess_zdwa_srt-(_s5eL_info)+16
	.long	0
	.long	2097153
_s5eL_info:
Lc5D7:
	mr	r15, r14
	addi	r22, r22, 4
	b	_filepathzm1zi2zi0zi0_SystemziFilePathziPosix_normalise_info
	.long  _s5eL_info - _s5eL_info_dsp
$ 

where the problem occurs when relocating:

_s5eL_info_dsp:
	.long	_Cabalzm1zi11zi0_DistributionziSimpleziPreProcess_zdwa_srt-(_s5eL_info)+16

This is the pattern that Igloo reduced to its bare bones in his q.s example earlier.

I have checked the Apple documentation and, yes, simonpj, in http://developer.apple.com/library/mac/documentation/DeveloperTools/Reference/Assembler/Assembler.pdf it is stated that var-dat+5 is a

Relocatable expression if both var and dat are defined in the same section

Nevertheless, in /usr/include/mach-o/reloc.h on my PPC Mac, I read:

 * Another type of generic relocation, GENERIC_RELOC_SECTDIFF, is to support
 * the difference of two symbols defined in different sections.  That is the
 * expression "symbol1 - symbol2 + constant" is a relocatable expression when
 * both symbols are defined in some section. ...

For the record, here is the Igloo example repeated on the PPC Mac (Igloo's is an Intel Mac, as far as I know):

$ cat q.s 
.const_data
foo:
        .long  123
        .align 2
.text
        .align 2
        .long   foo - bar + 16
        .long   0
        .long   125
bar:
        .long 124
$ gcc -c q.s
$ ld -r q.o -o w.o
$ otool -rt q.o w.o
q.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    15      1         0x00000010
00000000 0     2      n/a    1       1         0x0000000c
(__TEXT,__text) section
00000000 00000014 00000000 0000007d 0000007c 
w.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    8       1         0x00000010
00000000 0     3      n/a    1       1         0xfffffffc
(__TEXT,__text) section
00000000 00000014 00000000 0000007d 0000007c 
$ 
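
For reading the type column in the otool output above, a small lookup can help. The following Python sketch assumes that the PPC relocation codes 1, 8 and 15 carry the names PPC_RELOC_PAIR, PPC_RELOC_SECTDIFF and PPC_RELOC_LOCAL_SECTDIFF in /usr/include/mach-o/ppc/reloc.h; the table and the helper name describe are illustrative only and worth checking against the header on the machine at hand.

```python
# Assumed names for the PPC relocation type codes seen in the dumps
# (to be checked against /usr/include/mach-o/ppc/reloc.h).
PPC_RELOC_NAMES = {
    1: "PPC_RELOC_PAIR",
    8: "PPC_RELOC_SECTDIFF",
    15: "PPC_RELOC_LOCAL_SECTDIFF",
}


def describe(reloc_type: int) -> str:
    """Return the assumed symbolic name, or a placeholder for unknown codes."""
    return PPC_RELOC_NAMES.get(reloc_type, "unknown(%d)" % reloc_type)


# Type columns of the two entries in q.o and w.o, in dump order.
for obj, types in (("q.o", (15, 1)), ("w.o", (8, 1))):
    print(obj, [describe(t) for t in types])
```

Read this way, ld -r appears to rewrite the local section-difference entry as an ordinary section-difference entry, in both cases followed by its pair entry.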

There are differences between the PPC Mac and the Intel Mac, but the central mystery remains: the symbolnum/value of 0xfffffffc (= -4 when read as a two's-complement signed value). I have studied various sources to try to make sense of such a value as the second entry of a relocation pair (which this is, see /usr/include/mach-o/reloc.h): http://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/MachOTopics/1-Articles/dynamic_code.html, http://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html, /usr/include/mach-o/ppc/reloc.h, /usr/include/mach-o/loader.h, and http://www.opensource.apple.com/source/cctools/cctools-750/ld/ppc_reloc.c. However, I have not been able to come to a definite conclusion.
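
The signed reading of the dumped field can be checked with a few lines of Python. This is a minimal sketch; the helper name as_signed32 is made up for illustration.

```python
def as_signed32(value: int) -> int:
    """Interpret a 32-bit field as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the value represents a negative number.
    return value - 0x100000000 if value & 0x80000000 else value


print(as_signed32(0xFFFFFFFC))  # pair value in w.o
print(as_signed32(0x0000000C))  # pair value in q.o
```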

I have a vague idea that, somehow, during the processing of the code, the expression (bar-16) is formed, where bar represents the offset of the label bar: within its segment/section (.text), and that this expression becomes negative in this particular case because we are very close to the beginning of the .text segment/section, causing the trouble. This is consistent with the experience that the problem is more likely to appear when object splitting is enabled, because then many more beginnings of segments/sections are presumably involved.

A test of this idea would be something like this:

$ cat r.s
.const_data
foo:
        .long  123
        .align 2
.text
        .align 2
        .long   foo - bar + 16
        .long   0
        .long   125
        .long   126
        .long   127
bar:
        .long 124
$ gcc -c r.s
$ ld -r r.o -o x.o
$ otool -rt r.o x.o
r.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    15      1         0x00000018
00000000 0     2      n/a    1       1         0x00000014
(__TEXT,__text) section
00000000 00000014 00000000 0000007d 0000007e 
00000010 0000007f 0000007c 
x.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    8       1         0x00000018
00000000 0     3      n/a    1       1         0x00000004
(__TEXT,__text) section
00000000 00000014 00000000 0000007d 0000007e 
00000010 0000007f 0000007c 
$ 

With bar now at offset 20 within the .text, the symbolnum/value remains non-negative and therefore, presumably, steers clear of the particular problem that we are facing. On the other hand, the x.o symbolnum/value is still, mysteriously, different from the r.o symbolnum/value, so it is not clear whether this will work.
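
One way to test the (bar-16) idea against the two dumps is to compute the offset of bar minus the addend, truncated to 32 bits, and compare it with the pair value that ld -r produced. The Python sketch below does only that. The function name guessed_pair_value is hypothetical, and agreement with the dumps would not show that ld actually computes the value this way.

```python
ADDEND = 16  # the "+ 16" in "foo - bar + 16"


def guessed_pair_value(bar_offset: int, addend: int = ADDEND) -> int:
    """Hypothesis only: pair value = (offset of bar - addend), as a 32-bit field."""
    return (bar_offset - addend) & 0xFFFFFFFF


# (offset of bar within .text, pair value dumped by otool after ld -r)
cases = {
    "w.o (from q.s)": (12, 0xFFFFFFFC),
    "x.o (from r.s)": (20, 0x00000004),
}

for name, (bar_offset, dumped) in cases.items():
    guess = guessed_pair_value(bar_offset)
    print("%s: guess=0x%08x dumped=0x%08x match=%s"
          % (name, guess, dumped, guess == dumped))
```

If both lines report a match, the difference between the r.o and x.o pair values would simply be the addend of 16 being folded in, which fits the guess but does not explain it.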

Another path suggested by the PreProcess.s code is to force the addition foo+16, something along these lines:

$ cat s.s
.const_data
foo:
        .long  123
        .long  126
        .long  127
        .long  128
        .long  129
        .align 2
.text
        .align 2
foo2 = foo + 16
        .long   foo2 - bar
        .long   0
        .long   125
bar:
        .long 124
$ gcc -c s.s
$ ld -r s.o -o y.o
$ otool -rt s.o y.o
s.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    15      1         0x00000020
00000000 0     2      n/a    1       1         0x0000000c
(__TEXT,__text) section
00000000 00000014 00000000 0000007d 0000007c 
y.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    8       1         0x00000020
00000000 0     3      n/a    1       1         0x0000000c
(__TEXT,__text) section
00000000 00000014 00000000 0000007d 0000007c 
$ 

In s.s, the lines with .long constants 126-129 are needed to satisfy, initially, the linker and, finally, the assembler.

And here we see, finally, a consistent pattern in both of the .o files. It would appear possible to introduce such a work-around in the GHC native code generator.

All through this, note that the dumped value of foo-bar+16, the first word of the dumped .text, remains constant at 0x00000014. Solving foo-bar+16 = 0x14 = 20, we see that this implies foo = bar+4. In other words, the .const_data is assumed to be immediately following the last word, the .long 124, of the .text. This is confirmed by dissecting the involved .o files in detail.
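
The arithmetic behind this observation can be replayed for q.s, r.s and s.s. The Python sketch below assumes, as inferred above, that .const_data starts immediately after the last word of .text; the function names are illustrative.

```python
WORD = 4  # size of one .long in bytes


def layout(longs_before_bar, symbol_offset_in_const):
    """Return (bar, symbol) addresses, assuming .const_data begins right
    after the .long at bar, which is the last word of .text."""
    bar = WORD * longs_before_bar           # offset of bar within .text
    const_data_start = bar + WORD           # assumption: no gap after .text
    return bar, const_data_start + symbol_offset_in_const


def first_text_word(longs_before_bar, symbol_offset_in_const, addend):
    """Value of 'symbol - bar + addend' under the layout assumption."""
    bar, symbol = layout(longs_before_bar, symbol_offset_in_const)
    return symbol - bar + addend


print(hex(first_text_word(3, 0, 16)))   # q.s: foo - bar + 16
print(hex(first_text_word(5, 0, 16)))   # r.s: foo - bar + 16
print(hex(first_text_word(3, 16, 0)))   # s.s: foo2 - bar, with foo2 = foo + 16
```

Under the same assumption, foo (or foo2 in s.s) lands at 0x10, 0x18 and 0x20 respectively, which agrees with the symbolnum/value of the first relocation entry in each of the three dumps.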

Best regards Thorkil

comment:18 Changed 7 years ago by igloo

Milestone: 7.0.2 → 7.0.3
Priority: normal → high

Thanks for the analysis, Thorkil!

comment:19 Changed 7 years ago by igloo

Owner: set to igloo

comment:20 in reply to:  19 Changed 7 years ago by gwright

Replying to igloo: Is there a canonical example of this bug? If you can tell me which ghc version shows it I can look at it, since I have access to both 10.5 (x86_64) and 10.6 (x86/x86_64) systems.

If you'd like, I can take this bug since I've been "thinkin' about linkin'".

comment:21 Changed 7 years ago by igloo

Resolution: fixed
Status: new → closed

I've just successfully validated with

SupportsSplitObjs = YES
SplitObjs = YES

in mk/validate.mk, both 32bit and 64bit. Maybe it's a bug that was fixed in 10.6, or in a newer XCode?

Hmmm:

$ cat q.s
.const_data
foo:
        .long  123
        .align 2
.text
        .align 2
        .long   foo - bar + 16
        .long   0
        .long   125
bar:
        .long 124

$ gcc -c q.s
$ ld -r q.o -o w.o
Segmentation fault
$ ls
q.o     q.s
$ file q.o
q.o: Mach-O 64-bit object x86_64
$ 

But it now looks sensible in 32bit mode:

$ gcc -c q.s -m32
$ ld -r q.o -o w.o
$ otool -r ?.o
q.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    4       1         0x00000010
00000000 0     2      n/a    1       1         0x0000000c
w.o:
Relocation information (__TEXT,__text) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000000 0     2      n/a    4       1         0x00000010
00000000 0     2      n/a    1       1         0x0000000c
$

I've pushed

Sat Feb 19 19:14:09 GMT 2011  Ian Lynagh <igloo@earth.li>
  * Reenable object splitting on Darwin, now #4013 appears to be fixed

so we'll see if any remaining issues arise. Otherwise, it looks like this was fixed by upstream.

comment:22 Changed 7 years ago by chak

Owner: igloo deleted
Resolution: fixed
Status: closed → new

comment:23 Changed 7 years ago by thorkilnaur

Hello,

I am looking into possibly upgrading the tn23 Xcode. As I understand, also from a discussion on #ghc the other day, this may fix the problem.

Best regards Thorkil

comment:24 Changed 7 years ago by igloo

Resolution: fixed
Status: new → closed

Should be fixed for old versions of OS X by

Fri Feb 25 18:43:58 GMT 2011  Ian Lynagh <igloo@earth.li>
  * Turn off split objects on Darwin if XCode < 3.2 (#4013)

comment:25 Changed 7 years ago by chak

Resolution: fixed
Status: closed → new

The last patch doesn't seem to have helped: http://darcs.haskell.org/ghcBuilder/builders/tn23/271/8.html

comment:26 Changed 7 years ago by igloo

Resolution: fixed
Status: new → closed

Now really fixed:

Mon Mar  7 22:58:23 GMT 2011  Ian Lynagh <igloo@earth.li>
  * Improve the XCode version detection
  Amongst other improvements, we now handle 3-component versions
  (like "3.1.4") correctly.

http://darcs.haskell.org/ghcBuilder/builders/tn23/280.html
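
To illustrate why 3-component versions need care, here is a minimal Python sketch of a component-wise comparison. It is not the code from the patch (GHC's check lives in its configure machinery); the function names are hypothetical, and the 3.2 threshold is taken from the earlier patch description.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version such as "3.1.4" into integer components."""
    return tuple(int(part) for part in text.strip().split("."))


def split_objs_ok(xcode_version: str,
                  minimum: Tuple[int, ...] = (3, 2)) -> bool:
    """True when the version is at least the minimum, compared
    component by component rather than as a string or a float."""
    return parse_version(xcode_version) >= minimum


for version in ("3.1.4", "3.2", "3.2.1", "3.10"):
    print(version, split_objs_ok(version))
```

Comparing integer tuples avoids two traps: a version like "3.1.4" cannot be read as a single number, and "3.10" would sort before "3.2" if compared as text.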

comment:27 Changed 12 months ago by bgamari

It seems that Apple has broken split-sections again; see #12479.

Also, it's not clear that an autoconf check is enough to guarantee that users aren't affected by this issue. There's no reason to believe that the linker at configure-time is the same linker that will be used by the built compiler (consider, for instance, the case of a binary distribution built on a working toolchain but downloaded and used on a broken toolchain).

I think we have little choice but to just disable split sections on OS X.
