Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#7510 closed bug (fixed)

Immediate seg-fault on 32-bit windows build

Reported by: simonpj Owned by: simonmar
Priority: highest Milestone: 7.8.1
Component: Compiler Version: 7.6.1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:


On 32-bit Windows (built with MSYS), GHC crashes almost immediately when used with -prof, -debug, -threaded or any other "way" flag. Ian boiled it down as follows.

It seems that the commits


introduced the problem, but I'm not sure whether they are the direct cause or just happened to tickle a pre-existing bug. Attached is a self-contained module that goes wrong:

$ inplace/bin/ghc-stage1 -debug --make test -O -fforce-recomp $ ./test +RTS -DS; echo $?
cap 0: initialised
Segmentation fault/access violation in generated code

Works with 7.4.1:

$ ghc -debug --make test -O -fforce-recomp $ ./test; echo $?

It's quite fragile, e.g. if I remove

    ghcMode               :: (),

from the DynFlags type then it succeeds.

Attachments (2)

test.hs (9.6 KB) - added by simonpj 2 years ago.
test-reduced.hs (5.8 KB) - added by joeyadams 2 years ago.
Simplified testcase (compile with ghc-stage1 -O0 -fforce-recomp test-reduced.hs)

Download all attachments as: .zip

Change History (18)

Changed 2 years ago by simonpj

comment:1 Changed 2 years ago by simonpj

  • Milestone set to 7.8.1
  • Priority changed from normal to highest

Changed 2 years ago by joeyadams

Simplified testcase (compile with ghc-stage1 -O0 -fforce-recomp test-reduced.hs)

comment:2 Changed 2 years ago by joeyadams

I was able to simplify Simon's standalone testcase quite a bit, and still get it to segfault (test-reduced.hs). The segfault occurs if you build without optimization (-O0), and occurs on both Windows and Linux 32-bit. It does not occur on Linux 64-bit.

However, there's another symptom: -ddump-cmm produces a much larger file (roughly 5 MB instead of 250 KB) with GHC HEAD than with 7.4.2 or 7.6.1. The size of the -ddump-cmm output scales with the number of fields in data DynFlags (about 35 KB per field). This symptom occurs on both 32-bit and 64-bit.

Speculation: there's a bug in how record update is compiled. This bug is not platform- or architecture-specific, though it manifests itself on x86 32-bit by clobbering the call stack.

comment:3 Changed 2 years ago by joeyadams

Actually, I get a big cmm dump if I enable the new codegen on GHC 7.4.1 or 7.6.1:

ghc-7.6.1 -O0 -fnew-codegen -ddump-cmmz -fforce-recomp test-reduced.hs > cmm

However, the resulting program does not segfault.

comment:4 Changed 2 years ago by joeyadams

I think I found the bug, introduced by commit 0b0a41f: "Teach the linear register allocator how to allocate more stack if necessary"

Somewhere in compiler/nativeGen/AsmCodeGen.lhs:

-- do linear register allocation
let reg_alloc proc = do
       (alloced, maybe_more_stack, ra_stats) <-
               Linear.regAlloc dflags proc
       case maybe_more_stack of
         Nothing -> return ( alloced, ra_stats )
         Just amount ->
           return ( ncgAllocMoreStack ncgImpl amount alloced
                  , ra_stats )

ncgAllocMoreStack is implemented by X86.Instr.allocMoreStack:

  :: Platform
  -> Int
  -> NatCmmDecl statics X86.Instr.Instr
  -> NatCmmDecl statics X86.Instr.Instr

allocMoreStack _ _ top@(CmmData _ _) = top
allocMoreStack platform amount (CmmProc info lbl live (ListGraph code)) =
        CmmProc info lbl live (ListGraph (map insert_stack_insns code))
    alloc   = mkStackAllocInstr platform amount
    dealloc = mkStackDeallocInstr platform amount

mkStackAllocInstr is implemented by x86_mkStackAllocInstr:

        :: Platform
        -> Int
        -> Instr
x86_mkStackAllocInstr platform amount
  = case platformArch platform of
      ArchX86    -> SUB II32 (OpImm (ImmInt amount)) (OpReg esp)
      ArchX86_64 -> SUB II64 (OpImm (ImmInt amount)) (OpReg rsp)
      _ -> panic "x86_mkStackAllocInstr"

It looks to me like the amount is never multiplied to convert from slots to bytes.

comment:5 Changed 2 years ago by simonpj

  • Owner set to simonmar

Thanks Joey! Simon M: since you made the commit, might you look at Joey's analysis? If he's right the solution should be easy.

It's quite urgent to fix this, since it utterly breaks the Windows build.



comment:6 Changed 2 years ago by igloo

Should we make newtypes of Int called Bytes and Words, so the typechecker can stop this sort of problem from occurring in the future?

comment:7 Changed 2 years ago by simonpj

Perhaps. We define type synonyms ByteOff and WordOff in SMRep for this purpose, and they are very effective as documentation though not enforced. I'm really not sure how much extra clutter newtypes would introduce.

The first thing is to fix the bug though!


comment:8 Changed 2 years ago by igloo

I just tried a build with

--- a/compiler/nativeGen/X86/Instr.hs
+++ b/compiler/nativeGen/X86/Instr.hs
@@ -741,22 +741,22 @@ x86_mkJumpInstr id
         :: Platform
-        -> Int
+        -> Int -- number of stack slots needed
         -> Instr
 x86_mkStackAllocInstr platform amount
   = case platformArch platform of
-      ArchX86    -> SUB II32 (OpImm (ImmInt amount)) (OpReg esp)
-      ArchX86_64 -> SUB II64 (OpImm (ImmInt amount)) (OpReg rsp)
+      ArchX86    -> SUB II32 (OpImm (ImmInt (4 * amount))) (OpReg esp)
+      ArchX86_64 -> SUB II64 (OpImm (ImmInt (8 * amount))) (OpReg rsp)
       _ -> panic "x86_mkStackAllocInstr"
         :: Platform
-        -> Int
+        -> Int -- number of stack slots needed
         -> Instr
 x86_mkStackDeallocInstr platform amount
   = case platformArch platform of
-      ArchX86    -> ADD II32 (OpImm (ImmInt amount)) (OpReg esp)
-      ArchX86_64 -> ADD II64 (OpImm (ImmInt amount)) (OpReg rsp)
+      ArchX86    -> ADD II32 (OpImm (ImmInt (4 * amount))) (OpReg esp)
+      ArchX86_64 -> ADD II64 (OpImm (ImmInt (8 * amount))) (OpReg rsp)
       _ -> panic "x86_mkStackDeallocInstr"

but that still segfaults when configuring the timeout program.

comment:9 Changed 2 years ago by igloo

But with

--- a/compiler/nativeGen/AsmCodeGen.lhs
+++ b/compiler/nativeGen/AsmCodeGen.lhs
@@ -189,7 +189,7 @@ x86_64NcgImpl dflags
        ,maxSpillSlots             = X86.Instr.maxSpillSlots dflags
        ,allocatableRegs           = X86.Regs.allocatableRegs platform
        ,ncg_x86fp_kludge          = id
-       ,ncgAllocMoreStack         = X86.Instr.allocMoreStack platform
+       ,ncgAllocMoreStack         = noAllocMoreStack -- X86.Instr.allocMoreStack platform
        ,ncgExpandTop              = id
        ,ncgMakeFarBranches        = id

the build fails with

"inplace/bin/ghc-stage1.exe" -static    -H32m -O -Werror -Wall -H64m -O0        -package-name Cabal-1.17.0  -hide-all-packages  -i -ilibraries/Cabal/Cabal/.  -ilibraries/Cabal/Cabal/dist-install/build -ilibraries/Cabal/Cabal/dist-install/build/autogen  -Ilibraries/Cabal/Cabal/dist-install/build -Ilibraries/Cabal/Cabal/dist-install/build/autogen  -Ilibraries/Cabal/Cabal/.        -optP-include -optPlibraries/Cabal/Cabal/dist-install/build/autogen/cabal_macros.h  -package array- -package base- -package bytestring- -package containers- -package deepseq- -package directory- -package filepath- -package pretty- -package process- -package time-    -fwarn-tabs -Wall -fno-ignore-asserts -XHaskell98 -XCPP -O2 -O -dcore-lint -fno-warn-deprecated-flags    -no-user-package-db -rtsopts  -w          -odir libraries/Cabal/Cabal/dist-install/build -hidir libraries/Cabal/Cabal/dist-install/build -stubdir libraries/Cabal/Cabal/dist-install/build  -hisuf hi -osuf  o -hcsuf hc -c libraries/Cabal/Cabal/./Distribution/Simple/SrcDist.hs -o libraries/Cabal/Cabal/dist-install/build/Distribution/Simple/SrcDist.o
ghc-stage1.exe: panic! (the 'impossible' happened)
  (GHC version 7.7.20130101 for i386-unknown-mingw32):
        Register allocator: out of stack slots (need 682)
   If you are trying to compile SHA1.hs from the crypto library then this
   is a known limitation in the linear allocator.

   Try enabling the graph colouring allocator with -fregs-graph instead.   You can still file a bug report if you like.

Please report this as a GHC bug:

so we are at least hitting this code, so it's plausible that it is to blame.

comment:10 Changed 2 years ago by joeyadams

I also tried a similar tweak to x86_mkStackAllocInstr and x86_mkStackDeallocInstr, but still got a segfault. I multiplied by 12 instead of 4, as suggested by the spillSlotSize function.

spillSlotSize :: DynFlags -> Int
spillSlotSize dflags = if is32Bit then 12 else 8
    where is32Bit = target32Bit (targetPlatform dflags)

Apparently, the problem is a little more complicated than a missing multiplication.

comment:11 Changed 2 years ago by igloo

I also tried kludging around it by adding Opt_RegsGraph to the default DynFlags, but the build then fails with

"inplace/bin/ghc-stage1.exe" -o utils/ghc-pkg/dist-install/build/tmp/ghc-pkg -static    -H32m -O -Werror -Wall -H64m -O0        -hide-all-packages  -i -iutils/ghc-pkg/.  -iutils/ghc-pkg/dist-install/build -iutils/ghc-pkg/dist-install/build/autogen  -Iutils/ghc-pkg/dist-install/build -Iutils/ghc-pkg/dist-install/build/autogen          -optP-include -optPutils/ghc-pkg/dist-install/build/autogen/cabal_macros.h  -package Cabal-1.17.0 -package base- -package bin-package-db- -package binary- -package bytestring- -package directory- -package filepath- -package process-    -XHaskell98 -XCPP -XForeignFunctionInterface -XNondecreasingIndentation    -no-user-package-db -rtsopts            -odir utils/ghc-pkg/dist-install/build -hidir utils/ghc-pkg/dist-install/build -stubdir utils/ghc-pkg/dist-install/build  -hisuf hi -osuf  o -hcsuf hc   utils/ghc-pkg/dist-install/build/Main.o utils/ghc-pkg/dist-install/build/Version.o utils/ghc-pkg/dist-install/build/CRT_noglob.o   
ghc-stage1.exe: panic! (the 'impossible' happened)
  (GHC version 7.7.20130101 for i386-unknown-mingw32):
        regSpill: out of spill slots!
       regs to spill = 1355
       slots left    = 677

Please report this as a GHC bug:

Aside from fixing the bug, should we look into why we're getting so much spilling?

comment:12 Changed 2 years ago by igloo

Same as #7498?

comment:13 Changed 2 years ago by marlowsd@…

commit 03d360f289a1c7e93fedf8cfa274cbe5929cd32c

Author: Simon Marlow <[email protected]>
Date:   Mon Jan 7 12:26:29 2013 +0000

    Fix bugs in allocMoreStack (#7498, #7510)
    There were four bugs here.  Clearly I didn't test this enough to
    expose the bugs - it appeared to work on x86/Linux, but completely by
    accident it seems.
    1. the delta was wrong by a factor of the slot size (as noted on #7498)
    2. we weren't correctly aligning the stack pointer (sp needs to be
    16-byte aligned on x86/x86_64)
    3. we were doing the adjustment multiple times in the case of a block
    that was both a return point and a local branch target.  To fix this I
    had to add new shim blocks to adjust the stack pointer, and retarget
    the original branches.  See comment for details.
    4. we were doing the adjustment for CALL instructions, which is
    unnecessary and wrong; only JMPs should be preceded by a stack
    (Someone with a PPC box will need to update the PPC version of
    allocMoreStack to fix the above bugs, using the x86 version as a

 compiler/nativeGen/AsmCodeGen.lhs |   10 ++--
 compiler/nativeGen/PPC/Instr.hs   |    7 +-
 compiler/nativeGen/X86/Instr.hs   |  124 +++++++++++++++++++++++++++----------
 3 files changed, 99 insertions(+), 42 deletions(-)

comment:14 Changed 2 years ago by simonmar

Could someone on Windows please test with this patch and close the ticket if all is well? Thanks.

comment:15 Changed 2 years ago by igloo

  • Resolution set to fixed
  • Status changed from new to closed

Looks good here, thanks!

comment:16 Changed 2 years ago by pho@…

commit c1feb5f9b82ab05a128ecb7678d2da3db078ff40

Author: PHO <[email protected]>
Date:   Mon Feb 11 13:49:26 2013 +0900

    Fix bugs in PPC.Instr.allocMoreStack (#7498)
    This patch is ported from #7510, which fixes the same bug in the x86 nativeGen.

 compiler/nativeGen/PPC/Instr.hs |  124 ++++++++++++++++++++++++++------------
 1 files changed, 85 insertions(+), 39 deletions(-)
Note: See TracTickets for help on using tickets.