#14329 closed bug (fixed)

GHC 8.2.1 segfaults while bootstrapping master

Milestone: 8.2.2
Component: Compiler
Version: 8.2.1
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Differential Rev(s): Phab:D4075
Earlier this week the Linux/amd64 Harbormaster started failing fairly reliably during validation. The stage0 compiler (GHC 8.2.1) often fails with a segmentation fault. The failures seem to have started with ef26182e2014b0a2a029ae466a4b121bf235e4e4, although I suspect that commit is not the cause. I was able to capture a core dump of the crashing stage0 compiler, which implicates the allocator:

Reading symbols from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/ghc...(no debugging symbols found)...done.
[New LWP 25151]
[New LWP 25160]
[New LWP 25156]
[New LWP 25158]
[New LWP 25157]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/".
Core was generated by `/opt/ghc/8.2.1/lib/ghc-8.2.1/bin/ghc -B/opt/ghc/8.2.1/lib/ghc-8.2.1 -hisuf hi -'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f836aaa2c90 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
[Current thread is 1 (Thread 0x7f83711b5340 (LWP 25151))]
(gdb) bt
#0  0x00007f836aaa2c90 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#1  0x00007f836aaa3211 in allocGroupOnNode () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#2  0x00007f836aa9dd41 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#3  0x00007f836aa9deb9 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#4  0x00007f836aa82a39 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#5  0x00007f836aa7fc06 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#6  0x00007f836aa9d461 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#7  0x00007f836aaa423a in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#8  0x00007f836aaa4b3c in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#9  0x00007f836aa8bbc8 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#10 0x00007f836aa8c912 in ?? () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#11 0x00007f836aa8da01 in scheduleWaitThread () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#12 0x00007f836aa99fae in hs_main () from /opt/ghc/8.2.1/lib/ghc-8.2.1/bin/../rts/
#13 0x0000000000427038 in ?? ()
#14 0x00007f83694fd2b1 in __libc_start_main (main=0x426fd0, argc=119, argv=0x7fffcfee8078, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffcfee8068) at ../csu/libc-start.c:291
#15 0x0000000000427069 in ?? ()

I wonder if we are running out of memory. The builder has only 4GB of RAM and four vCPUs, and I have seen GHC segfault due to OOM conditions in the past.

I took a look at the allocator and noticed that we never actually check whether the memory commit succeeded. I've fixed this in Phab:D4075.
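
For illustration, here is a minimal sketch of the general pattern, not the actual change in Phab:D4075. It assumes a two-step scheme in which address space is first reserved with PROT_NONE and later committed by remapping it read/write. The helper names reserve_region and commit_region are hypothetical and do not come from the RTS. The point is that the commit step compares the result of mmap against MAP_FAILED and reports the failure instead of handing back memory that may not be usable.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve `len` bytes of address space without
 * making them accessible. Returns NULL if the reservation fails. */
static void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: make a previously reserved range readable and
 * writable. Returns 0 on success and -1 on failure. The return value
 * of mmap is checked rather than assumed to be valid. */
static int commit_region(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        fprintf(stderr, "commit of %zu bytes at %p failed: %s\n",
                len, addr, strerror(errno));
        return -1;
    }
    return 0;
}
```

A caller would treat a -1 from commit_region as an out-of-memory condition and bail out with a diagnostic, rather than continuing and crashing later at an unrelated address.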

rts/posix: Ensure that memory commit succeeds

Previously we wouldn't check that mmap would succeed. I suspect this may
have been the cause of #14329.

Test Plan: Validate under low-memory condition

Reviewers: simonmar, austin, erikd

Reviewed By: simonmar

Subscribers: rwbarton, thomie

GHC Trac Issues: #14329

Differential Revision:

