Opened 3 years ago

Closed 3 years ago

#7303 closed bug (fixed)

RTS : Race condition with usage of timer_delete

Reported by: erikd Owned by:
Priority: normal Milestone: 7.6.2
Component: Compiler Version: 7.7
Keywords: Cc:
Operating System: Linux Architecture: Unknown/Multiple
Type of failure: Building GHC failed Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:


When using Qemu (a CPU emulator) to build GHC for ARM, the build occasionally either segfaults or hangs just before exit (i.e. the output files are written and seem complete, but GHC does not terminate cleanly, which halts the build).

While debugging POSIX timer support in Qemu I wrote a little test program for POSIX timers and found that under Qemu, about 1 time in 10, the signal associated with the POSIX timer gets delivered *after* the timer is deleted, causing either a segfault or a hang. While I have only confirmed this under Qemu/ARM emulation, the asynchronous nature of signals suggests that it is theoretically possible on real (i.e. non-emulated) systems as well.

I already have a tested patch that fixes this.

Attachments (1)

0001-rts-Ignore-signal-before-deleting-timer.-Fixes-7303.patch (1.1 KB) - added by erikd 3 years ago.


Change History (7)

comment:1 Changed 3 years ago by erikd

The patch has been validated on x86-64 Linux and powerpc-linux. With it, the Qemu/ARM build also works much better (there is still at least one bug in Qemu though).

@simonmar : If you'd like, I can commit this directly, but since I was messing with the RTS I thought I had better run this past you first.

comment:2 Changed 3 years ago by erikd

  • Status changed from new to patch

comment:3 Changed 3 years ago by erikd@…

commit 5f3c1055c2a5a59117985420909dd9148d7b2ba6

Author: Erik de Castro Lopo <[email protected]>
Date:   Sat Oct 6 17:23:01 2012 +1000

    rts: Ignore signal before deleting timer. Fixes #7303.
    Was getting an ocassional hang or segfault when building GHC in a
    Qemu user space emulation of ARM. Turned out that the ITIMER_SIGNAL
    was being delivered *after* the call to timer_delete(). Setting the
    signal to SIG_IGN before deleting the timer solves the problem.

 rts/posix/Itimer.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
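
For readers without the patch at hand, the ordering described in the commit message can be sketched roughly as follows. This is not the actual three-line change to rts/posix/Itimer.c; it is a minimal standalone illustration. The names TICK_SIGNAL, on_tick, start_ticker and stop_ticker are hypothetical, and the use of SIGVTALRM and CLOCK_REALTIME is an assumption made only for this sketch. The point is in stop_ticker: the signal disposition is set to SIG_IGN before timer_delete() is called, so a signal that arrives late is discarded instead of reaching a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Assumption for this sketch: the ticket's ITIMER_SIGNAL is stood in
   for by SIGVTALRM. Any signal number would illustrate the same idea. */
#define TICK_SIGNAL SIGVTALRM

static volatile sig_atomic_t tick_count = 0;  /* bumped by the handler */
static timer_t tick_timer;                    /* the POSIX timer handle */

/* Hypothetical handler: just counts ticks. */
static void on_tick(int sig)
{
    (void)sig;
    tick_count++;
}

/* Install the handler, then create and arm a periodic timer.
   interval_ns must be below one second. Returns 0 on success. */
int start_ticker(long interval_ns)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(TICK_SIGNAL, &sa, NULL) != 0)
        return -1;

    struct sigevent ev;
    memset(&ev, 0, sizeof ev);
    ev.sigev_notify = SIGEV_SIGNAL;
    ev.sigev_signo = TICK_SIGNAL;
    if (timer_create(CLOCK_REALTIME, &ev, &tick_timer) != 0)
        return -1;

    struct itimerspec its;
    its.it_value.tv_sec = 0;
    its.it_value.tv_nsec = interval_ns;
    its.it_interval = its.it_value;   /* periodic: reload with same value */
    return timer_settime(tick_timer, 0, &its, NULL);
}

/* Stop the ticker. The order matters: ignore the signal first, so a
   delivery that races with timer_delete() has no handler to run. */
int stop_ticker(void)
{
    if (signal(TICK_SIGNAL, SIG_IGN) == SIG_ERR)
        return -1;
    return timer_delete(tick_timer);
}
```

A caller would invoke start_ticker(1000000L) for a 1 ms tick and stop_ticker() at shutdown. POSIX specifies that setting a signal's action to SIG_IGN discards a pending instance of that signal, which is why ignoring before deleting should close the window described in this ticket. On older glibc versions the timer functions may need linking with -lrt.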

comment:4 Changed 3 years ago by erikd

  • Status changed from patch to merge

comment:5 Changed 3 years ago by igloo

  • difficulty set to Unknown
  • Milestone set to 7.6.2

comment:6 Changed 3 years ago by igloo

  • Resolution set to fixed
  • Status changed from merge to closed