Opened 14 months ago

Last modified 3 months ago

#8974 infoneeded bug

64 bit windows executable built with ghc-7.9.20140405+LLVM segfaults

Reported by: awson Owned by:
Priority: high Milestone: 7.12.1
Component: Compiler (LLVM) Version: 7.9
Keywords: Cc: dterei, simonmar
Operating System: Windows Architecture: x86_64 (amd64)
Type of failure: Runtime crash Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

-- test.hs
import System.Mem (performMajorGC)

main = performMajorGC >> putStrLn "Done"

built with ghc -pgmlo opt -pgmlc llc -fllvm --make test.hs segfaults both for LLVM 3.4 and 3.5svn (taken from http://sourceforge.net/projects/msys2/files/REPOS/MINGW/x86_64).

32-bit ghc-7.9.20140404+llvm produces good executable.

Adding ArchX86_64 OSMinGW32 case to moduleLayout in compiler/llvmGen/LlvmCodeGen/Ppr.hs improves things slightly (some code segfaulting without it starts to work) but still does not cure the code above.

Also I've found the Cmm produced for LLVM CG differs from that produced for NCG.

Attachments (4)

T8947.ll (19.2 KB) - added by awson 14 months ago.
T8947_LLVMCG_cmm (32.9 KB) - added by awson 14 months ago.
T8947_NCG_cmm (26.6 KB) - added by awson 14 months ago.
ghc-w64-llvm34_v2.patch (4.7 KB) - added by awson 13 months ago.

Download all attachments as: .zip

Change History (32)

comment:1 follow-up: Changed 14 months ago by carter

CMM is generated *before* the NCG and LLVM backends... so is there some code path before the code gen that depends on which code gen is selected?

comment:2 Changed 14 months ago by ezyang

It would be helpful if you could post the C-- produced.

comment:3 Changed 14 months ago by ezyang

  • Cc dterei added

Changed 14 months ago by awson

Changed 14 months ago by awson

Changed 14 months ago by awson

comment:4 Changed 14 months ago by awson

To make Cmm shorter I've separated segfaulting code from main (it still segfaults when called from main) thus:

-- T8947.hs
module T8947 where

import System.Mem (performMajorGC)

t8947 :: IO ()
t8947 = performMajorGC >> putStrLn "Done"

T8947_LLVMCG_cmm T8947.ll are produced by ghc -O2 -pgmlo opt -pgmlc llc -fllvm -keep-llvm-files -ddump-cmm -c T8947.hs > T8947_LLVMCG_cmm.
T8947_NCG_cmm is produced by ghc -O2 -ddump-cmm -c T8947.hs > T8947_NCG_cmm.

comment:5 Changed 14 months ago by awson

Perhaps it would be interesting:

performMajorGC alone and putStrLn "Done" alone works.

putStrLn "Done" >> performMajorGC works.

And

foreign import ccall unsafe puts :: Ptr a -> IO ()

performMajorGC >> puts (Ptr "Done"#)

works too.

comment:6 in reply to: ↑ 1 Changed 14 months ago by jstolarek

Replying to carter:

CMM is generated *before* the NCG and LLVM backends... so is there some code path before the code gen that depends on which code gen is selected?

Yes, there is.

comment:8 Changed 14 months ago by awson

I've tried to make LLVM codegen to not trash anything (getTrashRegs = return []) but the problem is still here. Hence either my analysis is wrong or incomplete.

comment:9 Changed 14 months ago by thoughtpolice

  • Milestone changed from 7.8.2 to 7.8.3

comment:10 Changed 13 months ago by awson

Well, I've found the source of this bug. It turned out, windows does not like 64-bit offsets, perhaps, this is PE32+'s painful legacy.

Here is the difference between segfaulting and working (manually created) code:

--- T8947.s	2014-04-21 14:02:47.240488500 +0400
+++ T8947m.s	2014-04-21 15:22:41.951320900 +0400
@@ -85,7 +85,8 @@
 	.globl	T8947_t1_info_itable    # @T8947_t1_info_itable
 	.align	8
 T8947_t1_info_itable:
-	.quad	S1i6_srt-T8947_t1_info
+	.long	S1i6_srt-T8947_t1_info
+	.long	0
 	.quad	4294967299              # 0x100000003
 	.quad	0                       # 0x0
 	.quad	64424509455             # 0xf0000000f
@@ -145,7 +146,8 @@
 	.text
 	.align	8                       # @c1hV_info_itable
 c1hV_info_itable:
-	.quad	S1i6_srt-c1hV_info
+	.long	S1i6_srt-c1hV_info
+	.long	0
 	.quad	0                       # 0x0
 	.quad	47244640288             # 0xb00000020
 
@@ -167,7 +169,8 @@
 	.globl	T8947_t8947_info_itable # @T8947_t8947_info_itable
 	.align	8
 T8947_t8947_info_itable:
-	.quad	(S1i6_srt-T8947_t8947_info)+16
+	.long	(S1i6_srt-T8947_t8947_info)+16
+	.long	0
 	.quad	4294967299              # 0x100000003
 	.quad	0                       # 0x0
 	.quad	4294967311              # 0x10000000f

Bad data are generated by the following llvm code:

...
@T8947_t1_info_itable = constant %T8947_t1_entry_struct<{i64 add (i64 sub (i64 ptrtoint (i8* @S1i6_srt$alias to i64),i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @T8947_t1_info to i64)),i64 0), i64 4294967299, i64 0, i64 64424509455}>, section "X98A__STRIP,__me3", align 8
...
@c1hV_info_itable = internal constant %c1hV_entry_struct<{i64 add (i64 sub (i64 ptrtoint (i8* @S1i6_srt$alias to i64),i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @c1hV_info to i64)),i64 0), i64 0, i64 47244640288}>, section "X98A__STRIP,__me5", align 8
...
@T8947_t8947_info_itable = constant %T8947_t8947_entry_struct<{i64 add (i64 sub (i64 ptrtoint (i8* @S1i6_srt$alias to i64),i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @T8947_t8947_info to i64)),i64 16), i64 4294967299, i64 0, i64 4294967311}>, section "X98A__STRIP,__me7", align 8
...

But I don't quite understand where in the GHC code shall I intervene precisely to fix it.

Last edited 13 months ago by awson (previous) (diff)

comment:11 Changed 13 months ago by awson

I think that things are pretty much explained in native codegen code.

But AFAIUI, when the relevant LLVM code was written, non-Windows binutils were already improved and that was not taken into account (Windows binutils are not fixable anyway in general).

Then it looks pprInfoTable code is the point we could try to rewrite things at.

Unfortunately, at this point we are forced to "reverse engineer" what was done before, and it is tempting to intervene here, but it seems we can't intervene at this early stage because this can break things in contexts other than pprInfoTable's one.

comment:12 follow-ups: Changed 13 months ago by awson

I've decided that intervening at `genStaticLit (CmmLabelDiffOff l1 l2 off)` is safe and have rewritten the code to generate 32-bit arithmetic and pointer conversion, but it turned out LLVM generates unsuitable code for ptrtoint ... to i32 applied to 64-bit pointer.

For example, if

@T8947_t1_info_itable = constant %T8947_t1_entry_struct<{i64 add (i64 sub (i64 ptrtoint (i8* @S1fL_srt$alias to i64),i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @T8947_t1_info to i64)),i64 0), i64 4294967299, i64 0, i64 64424509455}>, section "X98A__STRIP,__me3", align 8

gets rewritten to

@T8947_t1_info_itable = constant %T8947_t1_entry_struct<{i32 add (i32 sub (i32 ptrtoint (i8* @S1i6_srt$alias to i32),i32 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @T8947_t1_info to i32)),i32 0), i32 0, i64 4294967299, i64 0, i64 64424509455}>, section "X98A__STRIP,__me3", align 8

LLVM instead of

T8947_t1_info_itable:
	.quad	S1i6_srt-T8947_t1_info

generates (assembler spits Error: invalid operands (.rdata and *ABS* sections) for `&')

T8947_t1_info_itable:
	.long	(S1i6_srt&-1)-(T8947_t1_info&-1)
	.long	0                       # 0x0

while we want it to be

T8947_t1_info_itable:
	.long	S1i6_srt-T8947_t1_info
	.long	0                       # 0x0

I'm in no way an LLVM expert and know very little about it. Is there a way to make LLVM generate the code we want or are we use the mangler here? Any thoughts?

comment:13 Changed 13 months ago by awson

Ok. I've implemented the mangler based approach.

For what I've tested so far it works.

It's a bit ugly (UUID magic) and fragile (mangler searches and replaces crlf line ending dependent pattern) because I did not bother to elaborate trivial and boring details.

The patch below consists of 2 orthogonal parts:

  • the first introduces target datalayout and target triple for 64-bit mingw32 LLVM, it is compatible with LLVM 3.4 and incompatible with current LLVM 3.5svn (mingw32 was changed to windows-gnu in target triple).
  • the second essentially solves the problem, described in this ticket.

I've implemented all platform-specific code to be selected in runtime (I believe, LLVM can choose a target dynamically, am I wrong?). And I've tested all on 64-bit GHC 7.9+ and MSYS2 built LLVM 3.4 *only*.

comment:14 Changed 13 months ago by awson

Last edited 13 months ago by awson (previous) (diff)

comment:15 Changed 13 months ago by awson

  • Status changed from new to patch

Changed 13 months ago by awson

comment:16 Changed 12 months ago by simonmar

  • Cc simonmar added

comment:17 in reply to: ↑ 12 Changed 12 months ago by bgamari

Replying to awson:

I'm in no way an LLVM expert and know very little about it. Is there a way to make LLVM generate the code we want or are we use the mangler here? Any thoughts?

I'm not sure I understand why LLVM produces the assembler it does in this case. Have you tried bringing this up with the LLVM folks? It may be that the fix belongs in LLVM.

comment:18 Changed 12 months ago by altaic

LLVM has to be able to do a proper pointer cast. I'd definitely take Ben's advise and talk to the LLVM folks about this.

comment:19 Changed 12 months ago by thoughtpolice

  • Status changed from patch to infoneeded

comment:20 Changed 12 months ago by thoughtpolice

  • Milestone changed from 7.8.3 to 7.8.4

Moving to 7.8.4.

comment:21 in reply to: ↑ 12 ; follow-up: Changed 9 months ago by Fanael

Replying to awson:

For example, if

@T8947_t1_info_itable = constant %T8947_t1_entry_struct<{i64 add (i64 sub (i64 ptrtoint (i8* @S1fL_srt$alias to i64),i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @T8947_t1_info to i64)),i64 0), i64 4294967299, i64 0, i64 64424509455}>, section "X98A__STRIP,__me3", align 8

gets rewritten to

@T8947_t1_info_itable = constant %T8947_t1_entry_struct<{i32 add (i32 sub (i32 ptrtoint (i8* @S1i6_srt$alias to i32),i32 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @T8947_t1_info to i32)),i32 0), i32 0, i64 4294967299, i64 0, i64 64424509455}>, section "X98A__STRIP,__me3", align 8

LLVM instead of

T8947_t1_info_itable:
	.quad	S1i6_srt-T8947_t1_info

generates (assembler spits Error: invalid operands (.rdata and *ABS* sections) for `&')

T8947_t1_info_itable:
	.long	(S1i6_srt&-1)-(T8947_t1_info&-1)
	.long	0                       # 0x0

while we want it to be

T8947_t1_info_itable:
	.long	S1i6_srt-T8947_t1_info
	.long	0                       # 0x0

I'm in no way an LLVM expert and know very little about it. Is there a way to make LLVM generate the code we want or are we use the mangler here? Any thoughts?

Yes, there is. Use trunc, for example:

%foo = type <{i32, i32}>
@aaa = global i32 5
@bbb = global i32 5
@foo = constant %foo<{i32 trunc(i64 sub(i64 ptrtoint (i32* @aaa to i64), i64 ptrtoint (i32* @bbb to i64)) to i32), i32 0}>

LLVM will generate

foo:
	.long	aaa-bbb
	.long	0                       # 0x0

comment:22 in reply to: ↑ 21 ; follow-up: Changed 9 months ago by awson

Replying to Fanael:

Yes, there is. Use trunc, for example:

%foo = type <{i32, i32}>
@aaa = global i32 5
@bbb = global i32 5
@foo = constant %foo<{i32 trunc(i64 sub(i64 ptrtoint (i32* @aaa to i64), i64 ptrtoint (i32* @bbb to i64)) to i32), i32 0}>

LLVM will generate

foo:
	.long	aaa-bbb
	.long	0                       # 0x0

AFAIR, the trunc alone is not sufficient. What you propose in fact is to declare a pair of 32-bit int variables instead of one 64-bit pointer global variable, right?

comment:23 in reply to: ↑ 22 Changed 9 months ago by Fanael

Replying to awson:

AFAIR, the trunc alone is not sufficient. What you propose in fact is to declare a pair of 32-bit int variables instead of one 64-bit pointer global variable, right?

As shown in the example, yes. The ideal solution would be to convert that truncated value back to i64, but even though the LLVM docs say that sext and zext are valid constant expressions, they don't work and yield an "Unsupported expression in static initializer" error.

comment:24 Changed 7 months ago by thoughtpolice

  • Milestone changed from 7.8.4 to 7.10.1

Moving (in bulk) to 7.10.4

comment:25 Changed 4 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1

comment:26 Changed 4 months ago by bgamari

Do we know exactly why Windows doesn't like 64-bit offsets? Presumably it has some support for this, no?

comment:27 Changed 3 months ago by Fanael

Replying to bgamari:

Do we know exactly why Windows doesn't like 64-bit offsets?

It's not Windows, it's binutils being broken (nothing new). I'm using Git version, so it's no just a problem with 2.22 or whatever's the ancient version of binutils GHC is bundled with.

For example, LLVM emits this:

	.quad	S1kt_srt$def-Main_main1_info$def # @"Main_main1_info$def"
	.quad	4294967299              # 0x100000003
	.quad	0                       # 0x0
	.quad	64424509455             # 0xf0000000f
Main_main1_info$def:

With GDB, we can learn that this value should equal

(gdb) p ((char*)&S1kt_srt$def - (char*)&Main_main1_info$def)
$1 = 3062336

But what actually lands in the executable is

(gdb) x/d Main_main1_info$def - 32
0x4015b0 <Main_main2_info$def+96>:      3062340

Makes me wonder if binutils devs are aware of the problem.

Last edited 3 months ago by Fanael (previous) (diff)

comment:28 Changed 3 months ago by Fanael

Note: See TracTickets for help on using tickets.