Opened 17 months ago
Last modified 5 months ago
#14329 infoneeded bug
GHC 8.2.1 segfaults while bootstrapping master
Reported by: | bgamari | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Compiler | Version: | 8.2.1 |
Keywords: | | Cc: | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | None/Unknown | Test Case: | |
Blocked By: | | Blocking: | |
Related Tickets: | #12960, #9065, #7762 | Differential Rev(s): | Phab:D4075 |
Wiki Page: | | | |
Description
Earlier this week the Linux/amd64 Harbormaster started failing somewhat reliably during validation. It seems the stage0 compiler (GHC 8.2.1) often fails with a segmentation fault. The failures appear to have started with ef26182e2014b0a2a029ae466a4b121bf235e4e4, although I suspect that commit isn't the cause. I was able to capture a core dump of the crashing stage0 compiler, which implicates the allocator:
    Reading symbols from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/ghc...(no debugging symbols found)...done.
    [New LWP 25151]
    [New LWP 25160]
    [New LWP 25156]
    [New LWP 25158]
    [New LWP 25157]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Core was generated by `/opt/ghc/8.2.1/lib/ghc-8.2.1/bin/ghc -B/opt/ghc/8.2.1/lib/ghc-8.2.1 -hisuf hi -'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0 0x00007f836aaa2c90 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    [Current thread is 1 (Thread 0x7f83711b5340 (LWP 25151))]
    (gdb) bt
    #0 0x00007f836aaa2c90 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #1 0x00007f836aaa3211 in allocGroupOnNode () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #2 0x00007f836aa9dd41 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #3 0x00007f836aa9deb9 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #4 0x00007f836aa82a39 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #5 0x00007f836aa7fc06 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #6 0x00007f836aa9d461 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #7 0x00007f836aaa423a in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #8 0x00007f836aaa4b3c in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #9 0x00007f836aa8bbc8 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #10 0x00007f836aa8c912 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #11 0x00007f836aa8da01 in scheduleWaitThread () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #12 0x00007f836aa99fae in hs_main () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/libHSrts_thr-ghc8.2.1.so
    #13 0x0000000000427038 in ?? ()
    #14 0x00007f83694fd2b1 in __libc_start_main (main=0x426fd0, argc=119, argv=0x7fffcfee8078, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffcfee8068) at ../csu/libc-start.c:291
    #15 0x0000000000427069 in ?? ()
Change History (10)
comment:1 Changed 17 months ago by
Differential Rev(s): | → Phab:D4075 |
---|---|
Status: | new → patch |
comment:3 Changed 16 months ago by
Resolution: | → fixed |
---|---|
Status: | patch → closed |
Merged to ghc-8.2 as a69fa5441c944d7f74c76bdae9f3dd198007ee42.
comment:4 Changed 13 months ago by
Resolution: | fixed → |
---|---|
Status: | closed → new |
It looks like the issue fixed in comment:2 isn't the only problem. We are still seeing segmentation faults on Harbormaster due to out-of-memory conditions. For instance,
    (gdb) run
    Starting program: /home/ben/ghc/inplace/lib/bin/ghc-stage1 -B/home/ben/ghc/inplace/lib -hisuf hi -osuf o -hcsuf hc -static -O0 -H64m -Wall -fllvm-fill-undef-with-garbage -Werror -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -this-unit-id ghc-8.5 -hide-all-packages -i -icompiler/backpack -icompiler/basicTypes -icompiler/cmm -icompiler/codeGen -icompiler/coreSyn -icompiler/deSugar -icompiler/ghci -icompiler/hsSyn -icompiler/iface -icompiler/llvmGen -icompiler/main -icompiler/nativeGen -icompiler/parser -icompiler/prelude -icompiler/profiling -icompiler/rename -icompiler/simplCore -icompiler/simplStg -icompiler/specialise -icompiler/stgSyn -icompiler/stranal -icompiler/typecheck -icompiler/types -icompiler/utils -icompiler/vectorise -icompiler/stage2/build -Icompiler/stage2/build -icompiler/stage2/build/./autogen -Icompiler/stage2/build/./autogen -Icompiler/. -Icompiler/parser -Icompiler/utils -Icompiler/../rts/dist/build -Icompiler/stage2 -optP-DGHCI -optP-include -optPcompiler/stage2/build/./autogen/cabal_macros.h -package-id base-4.11.0.0 -package-id deepseq-1.4.3.0 -package-id directory-1.3.1.5 -package-id process-1.6.2.0 -package-id bytestring-0.10.8.2 -package-id binary-0.8.5.1 -package-id time-1.8.0.2 -package-id containers-0.5.10.2 -package-id array-0.5.2.0 -package-id filepath-1.4.1.2 -package-id template-haskell-2.13.0.0 -package-id hpc-0.6.0.3 -package-id transformers-0.5.5.0 -package-id ghc-boot-8.5 -package-id ghc-boot-th-8.5 -package-id ghci-8.5 -package-id unix-2.7.2.2 -package-id terminfo-0.4.1.1 -Wall -Wno-name-shadowing -Wnoncanonical-monad-instances -Wnoncanonical-monadfail-instances -Wnoncanonical-monoid-instances -this-unit-id ghc -XHaskell2010 -XNoImplicitPrelude -optc-DTHREADED_RTS -DGHCI_TABLES_NEXT_TO_CODE -DSTAGE=2 -Rghc-timing -O -dcore-lint -dno-debug-output -Wcpp-undef -no-user-package-db -rtsopts -Wnoncanonical-monad-instances -odir compiler/stage2/build -hidir compiler/stage2/build -stubdir compiler/stage2/build -dynamic-too -c compiler/types/OptCoercion.hs -o compiler/stage2/build/OptCoercion.o -dyno compiler/stage2/build/OptCoercion.dyn_o -fforce-recomp
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Program received signal SIGSEGV, Segmentation fault.
    ---Type <return> to continue, or q <return> to quit---
    0x0000000002fcfe40 in alloc_mega_group ()
    (gdb) bt
    #0 0x0000000002fcfe40 in alloc_mega_group ()
    #1 0x0000000002fd038d in allocGroupOnNode ()
    #2 0x0000000002fe3dff in alloc_todo_block ()
    #3 0x0000000002fe3f56 in todo_block_full ()
    #4 0x0000000000406497 in evacuate ()
    #5 0x00000000004074ec in scavenge_block ()
    #6 0x0000000002fe3726 in scavenge_loop ()
    #7 0x0000000002fd0ed8 in GarbageCollect ()
    #8 0x0000000002fc5eeb in scheduleDoGC ()
    #9 0x0000000002fc68ce in scheduleWaitThread ()
    #10 0x0000000002fcf010 in hs_main ()
    #11 0x0000000000422684 in main ()
    (gdb)
while building 1cb12eae648c964c411f4c83730f3db05e409f48.
comment:5 Changed 13 months ago by
Related Tickets: | → #12960, #9065 |
---|
comment:6 Changed 13 months ago by
Related Tickets: | #12960, #9065 → #12960, #9065, #7762 |
---|
Unfortunately, the issue occurs in fewer than one in ten runs, even under rather strong memory pressure.
comment:7 Changed 13 months ago by
comment:8 Changed 13 months ago by
The recent spate of Harbormaster crashes seemingly began with the merge of Phab:D4341. However, I've tried reverting this patch with no apparent effect.
comment:9 Changed 7 months ago by
Could be related to #14346 if the segfaults seen with 8.2.1 came back with 8.4.
comment:10 Changed 5 months ago by
Milestone: | 8.2.2 → |
---|---|
Priority: | highest → normal |
Status: | new → infoneeded |
Is this reproducible with newer GHCs?
I wonder if we are running out of memory; the builder has only 4GB of RAM and four vCPUs. I have seen GHC segfault due to OOM conditions in the past.
I took a look at the allocator and noticed that we never actually check whether commit was successful. I've fixed this in Phab:D4075.
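To illustrate the kind of check described above, here is a minimal sketch in C of committing part of a reserved address range and testing the result. It is not the RTS code and not the patch in Phab:D4075; `reserve_region` and `commit_region` are hypothetical names introduced for illustration, and the sketch assumes a Linux-like system with `mmap`.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper (not an RTS function): reserve address space
 * without making it accessible. Returns NULL on failure. */
void *reserve_region(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper (not an RTS function): make `size` bytes at `addr`
 * readable and writable. `addr` must be page-aligned and lie inside a
 * region returned by reserve_region. Returns 0 on success, -1 on failure. */
int commit_region(void *addr, size_t size)
{
    assert(addr != NULL && size > 0);

    void *p = mmap(addr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        /* The step that is easy to omit: without this check the caller
         * would go on to touch memory that was never made accessible. */
        fprintf(stderr, "commit of %zu bytes failed: %s\n",
                size, strerror(errno));
        return -1;
    }
    return 0;
}
```

A caller would treat a `-1` result as an out-of-memory condition and report it rather than continue. Note that with memory overcommit enabled the kernel may accept the mapping and only fail later, so a check like this catches only the failures reported at commit time.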