Segmentation faults in profiled way
It seems that profiling has regressed sometime between GHC 8.0 and 8.2. A few times since September I have noticed that profiled programs (in particular, GHC itself built with profiling enabled) seem to segmentation fault.
This most recent case was produced by building commit 6ebfbdfb with the following in build.mk,
BuildFlavour = prof
define add_mods_flag =
$(foreach mod,$(2),$(eval $(basename $(mod))_HC_OPTS += $(1)))
endef
$(call add_mods_flag,-fprof-auto,$(wildcard compiler/typecheck/*.hs))
STRIP_CMD = :
and using the resulting stage2 compiler to bootstrap the same commit. Eventually the build will fail with a segmentation fault. Unfortunately it seems the crash isn't entirely reproducible.
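Because the crash is intermittent, one way to hunt for it is to rerun the failing command until it dies with SIGSEGV. The following is only a sketch: it assumes a POSIX-like shell that reports death by signal N as exit status 128+N (so 139 for SIGSEGV), and the `until_segv` helper name and its attempt limit are made up for illustration and are not part of the GHC build system.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it is killed by SIGSEGV.
# Usage: until_segv MAX_ATTEMPTS COMMAND [ARGS...]
# Returns 0 if a segfault was observed, 1 otherwise.
until_segv() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        "$@"
        status=$?
        # Shells report death by signal N as 128+N; SIGSEGV is signal 11.
        if [ "$status" -eq 139 ]; then
            echo "segfault on attempt $i"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no segfault in $max attempts"
    return 1
}
```

For example, `until_segv 20 make -j8` would retry the build up to twenty times and stop at the first run that segfaults.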
Trac metadata
Trac field | Value |
---|---|
Version | 8.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | highest |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture | |
Relates to
- #133878.2.17
Activity
- Ben Gamari mentioned in issue #5654 (closed)
- Ben Gamari mentioned in issue #13387 (closed)
- Ben Gamari changed milestone to %8.2.1
- Ben Gamari changed weight to 10
- Ben Gamari added Tbug Trac import labels
- Ben Gamari assigned to @bgamari
- Author Maintainer
Looking through history since 8.0, #5654 (closed) seems relevant.
- Author Maintainer
mpickering has reported this as #13387 (closed). The repro case on that ticket crashes reliably.
- Author Maintainer
Here is some gdb output from a crash,
$ ~/ghc-utils/debug-ghc ~/ghc/roots/8.2-profiled/bin/ghc -v -O2 Main.hs -fforce-recomp +RTS -p
gdb --args /home/ben/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315/bin/ghc -B/home/ben/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315 -v -O2 Main.hs -fforce-recomp +RTS -p
GNU gdb (Debian 7.12-4) 7.12
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/ben/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315/bin/ghc...run
done.
(gdb) run
Starting program: /mnt/work/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315/bin/ghc -B/home/ben/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315 -v -O2 Main.hs -fforce-recomp +RTS -p
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6853700 (LWP 19991)]
[New Thread 0x7ffff6052700 (LWP 19992)]
[New Thread 0x7ffff5851700 (LWP 19993)]
[New Thread 0x7ffff5050700 (LWP 19994)]
Glasgow Haskell Compiler, Version 8.2.0.20170315, stage 2 booted by GHC version 8.0.2
Using binary package database: /home/ben/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315/package.conf.d/package.cache
There is no package.cache in /home/ben/.ghc/x86_64-linux-8.2.0.20170315/package.conf.d, checking if the database is empty
There are no .conf files in /home/ben/.ghc/x86_64-linux-8.2.0.20170315/package.conf.d, treating package database as empty
package flags []
loading package database /home/ben/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315/package.conf.d
loading package database /home/ben/.ghc/x86_64-linux-8.2.0.20170315/package.conf.d
wired-in package ghc-prim mapped to ghc-prim-0.5.0.0
wired-in package integer-gmp mapped to integer-gmp-1.0.0.1
wired-in package base mapped to base-4.10.0.0
wired-in package rts mapped to rts
wired-in package template-haskell mapped to template-haskell-2.12.0.0
wired-in package ghc mapped to ghc-8.2.0.20170315
wired-in package dph-seq not found.
wired-in package dph-par not found.
package flags []
loading package database /home/ben/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315/package.conf.d
loading package database /home/ben/.ghc/x86_64-linux-8.2.0.20170315/package.conf.d
wired-in package ghc-prim mapped to ghc-prim-0.5.0.0
wired-in package integer-gmp mapped to integer-gmp-1.0.0.1
wired-in package base mapped to base-4.10.0.0
wired-in package rts mapped to rts-1.0
wired-in package template-haskell mapped to template-haskell-2.12.0.0
wired-in package ghc mapped to ghc-8.2.0.20170315
wired-in package dph-seq not found.
wired-in package dph-par not found.
*** Chasing dependencies:
Chasing modules from: *Main.hs
!!! Chasing dependencies: finished in 0.94 milliseconds, allocated 0.503 megabytes
Stable obj: []
Stable BCO: []
Ready for upsweep
  [NONREC
      ModSummary {
         ms_hs_date = 2017-03-07 08:43:42 UTC
         ms_mod = Main,
         ms_textual_imps = [(Nothing, Prelude)]
         ms_srcimps = []
      }]
*** Deleting temp files:
Deleting:
compile: input file Main.hs
*** Checking old interface for Main (use -ddump-hi-diffs for more details):
[1 of 1] Compiling Main ( Main.hs, Main.o )
*** Parser [Main]:
!!! Parser [Main]: finished in 73.04 milliseconds, allocated 54.502 megabytes
*** Renamer/typechecker [Main]:
!!! Renamer/typechecker [Main]: finished in 572.66 milliseconds, allocated 398.957 megabytes
*** Desugar [Main]:
Result size of Desugar (after optimization)
  = {terms: 11,856, types: 8,892, coercions: 0, joins: 0/0}
!!! Desugar [Main]: finished in 178.87 milliseconds, allocated 132.900 megabytes
*** Simplifier [Main]:

Thread 1 "ghc" received signal SIGSEGV, Segmentation fault.
0x00000000000002e1 in ?? ()
(gdb) bt
#0 0x00000000000002e1 in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb) info reg
rax    0x64a8ec0       105549504
rbx    0x64a8ec0       105549504
rcx    0x64a8ec0       105549504
rdx    0x420acd6000    283649073152
rsi    0x420acd6fff    283649077247
rdi    0x54c96c8       88905416
rbp    0x420b84fbc0    0x420b84fbc0
rsp    0x7fffffff9fc8  0x7fffffff9fc8
r8     0x1             1
r9     0x420b84fc40    283661106240
r10    0x8             8
r11    0x420b84ffd0    283661107152
r12    0x420acd5ff8    283649073144
r13    0x64b0718       105580312
r14    0x64ad160       105566560
r15    0x420b8480d0    283661074640
rip    0x2e1           0x2e1
eflags 0x10207         [ CF PF IF RF ]
cs     0x33            51
ss     0x2b            43
ds     0x0             0
es     0x0             0
fs     0x0             0
gs     0x0             0
(gdb) x/32a 0x420b84fbc0
0x420b84fbc0: 0x54cb638 <stg_sel_6_upd_info+184> 0x64a8ec0 <CCS_DONT_CARE>
0x420b84fbd0: 0x54c98d8 <stg_upd_frame_info> 0x42018bde00
0x420b84fbe0: 0x0 0x6245070
0x420b84fbf0: 0x54d2590 <stg_restore_cccs_eval_info> 0x42018bde00
0x420b84fc00: 0x54c99c0 <stg_marked_upd_frame_info> 0x42018bde00
0x420b84fc10: 0x0 0x42095d8000
0x420b84fc20: 0x54d2590 <stg_restore_cccs_eval_info> 0x42018bde00
0x420b84fc30: 0x54c99c0 <stg_marked_upd_frame_info> 0x42018bde00
0x420b84fc40: 0x0 0x42095d8020
0x420b84fc50: 0x54d2590 <stg_restore_cccs_eval_info> 0x42018bde00
0x420b84fc60: 0x54c99c0 <stg_marked_upd_frame_info> 0x42018bde00
0x420b84fc70: 0x0 0x42095d8040
0x420b84fc80: 0x54d2590 <stg_restore_cccs_eval_info> 0x42018bde00
0x420b84fc90: 0x54c99c0 <stg_marked_upd_frame_info> 0x42018bde00
0x420b84fca0: 0x0 0x42095d8060
0x420b84fcb0: 0x54d2590 <stg_restore_cccs_eval_info> 0x42018bde00
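The same information can be gathered without typing at the gdb prompt, which may help when the crash only shows up on some runs. This is a rough sketch, assuming gdb is on PATH. The `gdb_crash_report` name and the `DRY_RUN` variable are hypothetical. It dumps memory at `$rbp` because the address examined in the session above is the one held in rbp.

```shell
#!/bin/sh
# Hypothetical wrapper: run a program under gdb in batch mode and, once it
# stops (e.g. on SIGSEGV), print a backtrace, the registers, and 32 words
# starting at the address in rbp.
# Set DRY_RUN=1 to print the gdb command line instead of executing it.
gdb_crash_report() {
    # Prepend the gdb invocation to the caller's command line.
    set -- gdb -batch \
        -ex run \
        -ex bt \
        -ex 'info registers' \
        -ex 'x/32a $rbp' \
        --args "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With the paths from the session above, this might be invoked as `gdb_crash_report ~/ghc/roots/8.2-profiled/lib/ghc-8.2.0.20170315/bin/ghc -B... -v -O2 Main.hs -fforce-recomp +RTS -p`, with the `-B` argument filled in as shown there.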
- Author Maintainer
- Developer
That is really strange! Inlining doesn't affect semantics, and anything that passes Lint should not seg-fault. So it may have tickled the bug but it seems hard to believe that it's the cause.
- Author Maintainer
Quick update: At this point I have determined that the issue is the fix to #5654 (closed). Ultimately it seems like we are ending up with a stg_sel_5_upd being invoked on a SimplEnv.FloatFlag, which is not a single-constructor record (it is an enumeration). Naturally, things go terribly awry. Still trying to work out exactly how we get into this situation.
- Developer
Replying to [ticket:13433#comment:133707 bgamari]:
> Quick update: At this point I have determined that the issue is the fix to #5654 (closed). Ultimately it seems like we are ending up with a stg_sel_5_upd being invoked on a SimplEnv.FloatFlag, which is not a single-constructor record (it is an enumeration). Naturally, things go terribly awry. Still trying to work out exactly how we get into this situation.

How can the fix to a closed ticket fix a new one? I'm missing something.
- Author Maintainer
To put it another way, the fix to #5654 (closed) caused this regression. Reverting 3a18baff, 2a02040b, and 394231b3 fixes the crash.
- Maintainer
- Simon Marlow assigned to @simonmar and unassigned @bgamari
- Developer
I still haven't been able to repro this. I used exactly the build.mk above, and I've built all of nofib with make NoFibRuns=0 EXTRA_HC_OPTS="+RTS -p -RTS" without a single segfault, and I have a pile of .prof files.
This is Linux/x86_64, my tree is master @ bf3952ed. Is there anything that might be different about my environment compared to yours that might account for this?
- Developer
I'll try building exactly from 6ebfbdfb as in the description.
Also presumably your build.mk also has this:
ifneq "$(BuildFlavour)" ""
include mk/flavours/$(BuildFlavour).mk
endif
otherwise BuildFlavour has no effect, right?
- Author Maintainer
Yes, I should have been more specific: essentially I appended the cited snippet to build.mk.
Very odd that you have been unable to reproduce this. I'm looking into what might differ in our environments.
- Author Maintainer
Alright, I have once again reproduced this. Unfortunately I realized that you actually need to cherry-pick a few patches on top of 6ebfbd as it doesn't build on its own. One of these patches fixes a silly typo. The other is my rather crude fix to #13233 (closed) (Phab:D3063) ensuring we don't attempt to tick string literals. I'm a bit suspicious of the latter, but the build doesn't produce any lint warnings so I've been operating under the assumption that it's safe.
Without further ado, here is a full repro,
{{{
#!/bin/bash -e

git clone git://git.haskell.org/ghc --recursive ghc-T13433
cd ghc-T13433
git checkout 6ebfbdfb
git cherry-pick e4620dc7 22519050
git submodule update

cat >mk/build.mk <<'EOF'
BuildFlavour = prof
ifneq "$(BuildFlavour)" ""
include mk/flavours/$(BuildFlavour).mk
endif
GhcStage2HcOpts += -dcore-lint -dcmm-lint
define add_mods_flag =
$(foreach mod,$(2),$(eval $(basename $(mod))_HC_OPTS += $(1)))
endef
$(call add_mods_flag,-fprof-auto,$(wildcard compiler/typecheck/*.hs))
STRIP_CMD = :
EOF

./boot
./configure
make -j8

wget https://ghc.haskell.org/trac/ghc/raw-attachment/ticket/13387/Main.hs
inplace/bin/ghc-stage2 -O2 -fforce-recomp Main.hs +RTS -p
}}}
- Developer
Trac metadata
Trac field | Value |
---|---|
Differential revisions | - → D3386 |
- Author Maintainer
Yay Simon!
- Simon Marlow mentioned in commit 074d13eb
- Ben Gamari closed
- Author Maintainer
Merged to ghc-8.2 as bdcb0c85.
Trac metadata
Trac field | Value |
---|---|
Resolution | Unresolved → ResolvedFixed |
- Ben Gamari added Phighest label