Opened 7 months ago

Last modified 3 weeks ago

#15382 new bug

heapprof001 segfaults in prof_hc_hb way on i386

Reported by: bgamari
Owned by:
Priority: normal
Milestone: 8.10.1
Component: Compiler
Version: 8.4.3
Keywords:
Cc: simonmar, RyanGlScott
Operating System: Unknown/Multiple
Architecture: x86
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets: #15463
Differential Rev(s):
Wiki Page:

Description (last modified by bgamari)

I am seeing the heapprof001 testcase fail with a segmentation fault in the prof_hc_hb testsuite way on i386. For instance, https://circleci.com/gh/ghc/ghc/7104:

Wrong exit code for heapprof001(prof_hc_hb)(expected 0 , actual 139 )
Stdout ( heapprof001 ):
a <= 
a <= 
a <= 
a <= 
a <= 
a <= 
a <=
Stderr ( heapprof001 ):
Segmentation fault
*** unexpected failure for heapprof001(prof_hc_hb)

Change History (9)

comment:1 Changed 7 months ago by bgamari

Description: modified (diff)

comment:2 Changed 7 months ago by bgamari

Milestone: 8.6.1 → 8.8.1

These won't be fixed in GHC 8.6.

comment:3 Changed 2 months ago by bgamari

This also seems to fail periodically on amd64.

comment:4 Changed 2 months ago by Ben Gamari <ben@…>

In 8fd3f9a6/ghc:

testsuite: Mark heapprof001 as broken in prof_hc_hb way on i386

As documented in #15382, this is known to fail in prof_hc_hb on i386.
Concerningly, I have also seen this test non-deterministically fail in
prof_hc_hb on amd64. We should really investigate this.

comment:5 Changed 2 months ago by bgamari

I have been able to reproduce this issue under rr. The backtrace looks like this:

LDV_recordDead (c=c@entry=0x42000a60c8, size=4, size@entry=6) at rts/ProfHeap.c:205
205	                    ctr->c.ldv.void_total += size;
>>> bt
#0  LDV_recordDead (c=c@entry=0x42000a60c8, size=4, size@entry=6) at rts/ProfHeap.c:205
#1  0x0000000000a790f4 in processHeapClosureForDead (c=0x42000a60c8) at rts/LdvProfile.c:124
#2  processNurseryForDead () at rts/LdvProfile.c:192
#3  LdvCensusForDead (N=1) at rts/LdvProfile.c:236
#4  0x0000000000a8174b in GarbageCollect (collect_gen=collect_gen@entry=1, do_heap_census=do_heap_census@entry=false, gc_type=gc_type@entry=0, cap=cap@entry=0x12afd80 <MainCapability>, idle_cap=idle_cap@entry=0x0) at rts/sm/GC.c:461
#5  0x0000000000a6ff55 in scheduleDoGC (pcap=pcap@entry=0x7ffc87622758, force_major=force_major@entry=true, task=0x1e0d660) at rts/Schedule.c:1806
#6  0x0000000000a71110 in exitScheduler (wait_foreign=wait_foreign@entry=false) at rts/Schedule.c:2663
#7  0x0000000000a7cb12 in hs_exit_ (wait_foreign=false) at rts/RtsStartup.c:392
#8  0x0000000000a7cfc5 in shutdownHaskellAndExit (n=0, fastExit=fastExit@entry=0) at rts/RtsStartup.c:553
#9  0x0000000000a6dde1 in hs_main (argc=<optimized out>, argv=<optimized out>, main_closure=<optimized out>, rts_config=...) at rts/RtsMain.c:99
#10 0x0000000000410fd3 in main ()

ctr is NULL. There happens to be an ASSERT that checks for this, but the testcase isn't compiled against the debug RTS, so the check is compiled out and the NULL pointer is dereferenced.
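
To illustrate why a debug-only assertion does not help here, the following is a minimal sketch of a check that only exists when a debug flag is defined. The names `DEMO_ASSERT`, `demo_counter`, and `demo_record_dead` are hypothetical and are not the RTS definitions; the sketch only assumes that the assertion expands to nothing in a non-debug build.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a debug-only assertion: a real check when
 * DEBUG is defined, and a no-op otherwise. */
#ifdef DEBUG
#define DEMO_ASSERT(cond)                                        \
    do {                                                         \
        if (!(cond)) {                                           \
            fprintf(stderr, "assertion failed: %s\n", #cond);    \
            abort();                                             \
        }                                                        \
    } while (0)
#else
#define DEMO_ASSERT(cond) ((void)0)
#endif

/* Hypothetical counter record; the real profiling counter is richer. */
typedef struct {
    size_t void_total;
} demo_counter;

/* Adds size to the counter. Without DEBUG, nothing guards the
 * dereference below, so passing NULL is undefined behaviour and
 * typically crashes with a segmentation fault. */
void demo_record_dead(demo_counter *ctr, size_t size)
{
    DEMO_ASSERT(ctr != NULL);
    ctr->void_total += size;
}
```

Compiled without `-DDEBUG`, a call such as `demo_record_dead(NULL, 4)` would reach the dereference unchecked, which matches the crash site in frame #0 above.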

comment:6 Changed 2 months ago by bgamari

Cc: simonmar added

comment:7 Changed 8 weeks ago by Ben Gamari <ben@…>

In 9b65ae69/ghc:

rts: Turn ASSERT in LDV_recordDead into a normal if

As reported in #15382 the `ASSERT(ctr != NULL)` is currently getting routinely
hit during testsuite runs. While this is certainly a bug I would far prefer
getting a proper error message than a segmentation fault. Consequently I'm
turning the `ASSERT` into a proper `if` so we get a proper error in non-debug
builds.
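
The idea in this commit message can be sketched as follows. This is not the actual change to rts/ProfHeap.c: `ldv_counter_sketch` and `record_dead_checked` are hypothetical names, and the sketch returns an error code where the RTS would presumably stop with its own error reporting, so that the NULL case can be exercised safely.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter record used only for this illustration. */
typedef struct {
    size_t void_total;
} ldv_counter_sketch;

/* Unconditional check: present in debug and non-debug builds alike.
 * Returns 0 on success, -1 if no counter was found. */
int record_dead_checked(ldv_counter_sketch *ctr, size_t size)
{
    if (ctr == NULL) {
        /* Report the problem instead of dereferencing NULL. */
        fprintf(stderr, "record_dead_checked: counter is NULL\n");
        return -1;
    }
    ctr->void_total += size;
    return 0;
}
```

With this shape, a missing counter produces a diagnostic on stderr in every build flavour, while the underlying bug (why the counter is missing) still needs to be investigated.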

comment:8 Changed 8 weeks ago by osa1

Milestone: 8.8.1 → 8.10.1

Bumping milestones of low-priority tickets.

comment:9 Changed 3 weeks ago by RyanGlScott

Cc: RyanGlScott added

cc'ing since I'm also seeing this failure on validate-x86_64-linux-fedora27 (here)
