Opened 3 years ago

Closed 3 years ago

#11830 closed bug (fixed)

Disabling idle GC leads to freeze

Reported by: NeilMitchell Owned by:
Priority: highest Milestone: 8.2.1
Component: Runtime System Version: 8.0.1-rc3
Keywords: Cc: simonmar, ndmitchell@…, kolmodin
Operating System: Linux Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D2129
Wiki Page:

Description

I'm currently getting a runtime freeze, with one CPU spinning, on the latest GHC 8.0.1 RC (8.0.0.20160411). Testing two months ago against what was then the latest release candidate showed no problems. The reproduction steps are a bit long-winded:

Observe that Shake fails to complete and starts spinning on 1 CPU.

If you modify shake.cabal to remove -with-rtsopts=-I0 -qg -qb, it works again and completes in under a minute. Adding flags back with +RTS -I0 -RTS shows that -I0 alone is the culprit.

Attachments (1)

check_test_case.sh (1.5 KB) - added by kolmodin 3 years ago.
git bisect automation

Change History (11)

comment:1 Changed 3 years ago by NeilMitchell

Cc: ndmitchell@… added

comment:2 Changed 3 years ago by NeilMitchell

Setting -I1000 still succeeds, even though the computation takes < 20s and so never reaches that idle interval. That seems to imply that the idle setting of 0 specifically is the problem. With -S added, the last line of output is not a GC.

comment:3 Changed 3 years ago by NeilMitchell

Using GHC HEAD (but not the latest release candidate) I also get a freeze when running shake --demo --keep-going.

Changed 3 years ago by kolmodin

Attachment: check_test_case.sh added

git bisect automation

comment:4 Changed 3 years ago by kolmodin

comment:5 Changed 3 years ago by kolmodin

Cc: kolmodin added

comment:6 Changed 3 years ago by simonmar

Milestone: 8.0.1
Priority: normal → highest

Release blocker

comment:7 Changed 3 years ago by bgamari

Differential Rev(s): Phab:D2129
Status: new → patch

comment:8 Changed 3 years ago by bgamari

Milestone: 8.0.1 → 8.2.1

The patch in question has been reverted on ghc-8.0 (see #10840).

comment:9 Changed 3 years ago by Ben Gamari <ben@…>

In 16a51a6c/ghc:

rts: Close livelock window due to rapid ticker enable/disable

This fixes #11830, where the RTS would livelock if run with `-I0` due
to a regression introduced by bbdc52f3a6e6a28e209fb8f65699121d4ef3a4e3.
The reason for this is that the new codepath introduced a subtle race
condition:

 1. one thread could request that the ticker stop and would block until
    the ticker in fact stopped
 2. meanwhile, another thread could sneak in and restart the ticker

this was implemented in such a way where thread (1) would end up
blocked forever. The solution here is to simply not block. The worst
that will happen is that timer fires again, but is ignored since the
ticker is stopped.

Test Plan:
Validate, try reproduction case in #11830. Need to find a nice
testcase.

Reviewers: simonmar, erikd, hsyl20, austin

Reviewed By: erikd, hsyl20

Subscribers: erikd, thomie

Differential Revision: https://phabricator.haskell.org/D2129

GHC Trac Issues: #11830
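
The idea behind the fix described in the commit message above (stop requests do not block, and a late tick is simply ignored) can be illustrated with a small model. This is only a sketch in Python, not the RTS code: the Ticker class, its method names, and its counters are all hypothetical, and the real ticker lives in the C runtime.

```python
import threading


class Ticker:
    """Toy model of a periodic ticker (hypothetical; not the GHC RTS code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stopped = False
        self.handled = 0  # ticks that were acted upon
        self.ignored = 0  # ticks that arrived while stopped

    def start(self):
        with self._lock:
            self._stopped = False

    def stop(self):
        # Record the request and return. The lock is held only for the
        # flag update; we never wait for the ticker to acknowledge the
        # stop, so a concurrent start() cannot leave this caller stuck.
        with self._lock:
            self._stopped = True

    def on_tick(self):
        # Called each time the underlying timer fires.
        with self._lock:
            if self._stopped:
                self.ignored += 1  # a late tick is harmless
                return False
            self.handled += 1
            return True


def demo():
    t = Ticker()
    t.start()
    t.on_tick()  # handled
    t.stop()     # returns immediately
    t.on_tick()  # timer fired once more; ignored
    t.start()    # e.g. another thread restarts the ticker
    t.on_tick()  # handled again
    return t.handled, t.ignored
```

In this model a stop followed by a concurrent restart cannot hang the stopping thread, because nothing waits on the ticker's state. The cost is the one the commit message names: a tick may still arrive after the stop request, and it is dropped.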

comment:10 Changed 3 years ago by bgamari

Resolution: fixed
Status: patch → closed

The commit in comment:9 isn't quite the whole story due to lacking synchronization. This is cleaned up in the rework of the itimer subsystem in 999c464da36e925bd4ffea34c94d3a7b3ab0135c (which also addresses #11965).
