GHCi debugger panics when trying force a certain variable

changed weight to 5

added GHCi Tbug Trac import labels

Attached file Test.hs ($909).

The Test.hs file containing the definitions.

I can confirm this behavior on MacOS 10.8.4 (x86_64) and with HEAD as of 27 Aug.

Confirmed on Linux x86_64 with ghc-7.9.20141115.

mentioned in commit a61e717f

changed weight to 7

Trac metadata

Trac field	Value
Priority	normal → high

mentioned in issue #12383

Confirmed on 8.4.1 and HEAD (8.5.20180308).

added debugger label

I've been looking into this, here's what I found out so far:

Because we start evaluating foo before hitting the breakpoint, by the time the we return to the GHCi prompt foo points to a blackhole.
Once we stop at the breakpoint we do :print foo, pprintClosureCommand calls bindSuspensions with the id foo.
bindSuspensions invents a new name _t1 and binds it to the thunk that is foo, via RtClosureInspect.cvObtainTerm.
cvObtainTerm looks at the heap object pointed to by foo, which is a blackhole, and follows the indirectee pointer. It turns out the indirectee is a TSO object. At this point _t1 becomes bound to a TSO object, and evaluating it (e.g. with print _t1) causes this crash because TSO objects can't be entered.

I tried modifying cvObtainTerm so that it doesn't follow the indirectee pointer when it sees a blackhole. That way we bind _t1 to the blackhole object instead of the TSO object pointed by the indirectee field, but that caused a deadlock in the scheduler. I don't understand why yet.

simonmar, could you advise? Does the story make sense so far?

Simon, I implemented changes in cvObtainTerm as discussed yesterday, but I'm still getting "TSO object entered" errors.

Previously cvObtainTerm follwed indirectee's of BLACKHOLEs no matter what.

With my changes I only follow the indirectees when they're not TSO or BLOCKING_QUEUE. Somehow with this I still get "TSO object entered".

If I don't follow BLACKHOLE indirectees at all (and bind a BLACKHOLE to _t1 in this reproducer) then I get a deadlock as expected. Do you have any ideas on why this may be happening?

changed the description

assigned to @trac-Nolan

I simplified example. whnf function is not required to trigger this bug. You can also notice that context of execution (i.e. [main.hs:2:7-11]) prints twice which looks like a bug and probably deserves a ticket.

Thanks for the simplification. Which GHC version did you use to try this? I don't see the prompt printed twice on 8.4.1 and HEAD.

I use most current version from git repository(8.5.20180323). I've just built it.

mentioned in issue #14973 (closed)

Thanks, it turns out I have to use the default prompt to reproduce. Reported as #14973 (closed).

I created a differential with an implementation of the idea in ticket:8316#comment:150788. The patch seems to do the right thing but there's probably another bug in somewhere else so I'm still getting "TSO entered" errors.

Trac metadata

Trac field	Value
Differential revisions	- → D4535

mentioned in issue #15010

changed the description

Bug with context printed twice was fixed.

Ah, looking at this ticket again, I can see what I missed last time (in D4535).

The problem is D4535 does not have any effect because TSO and BLOCKING_QUEUE are already not handled by cvObtainTerm.go and cvObtainTerm returns a Suspension when it finds one of those objects. So even if we follow a BLACKHOLE that points to a TSO we return a Suspension. In D4535 we returned Suspension slightly earlier (before following the indirectee), but the value we returned was identical to the value we returned without the patch.

What we should do is if we see a BLACKHOLE pointing to an TSO or BLOCKING_QUEUE we should return a Suspension with the BLACKHOLE itself as the hval (currently: hval is the indirectee).

However I suspect entering the BLACKHOLE will result in a deadlock because the thread that's supposed to evaluate the expression (i.e. the owner) is blocked on an MVar (the breakpoint MVar passed to GHCi.Run.withBreakAction) and when we enter the BLACKHOLE our thread gets parked, to be unparked by the owner of the BLACKHOLE, which never happens as we don't update the MVar before entering the BLACKHOLE.

I submitted a diff. As expected, it causes a deadlock in the reproducer.

Simon, any ideas on how to fix the deadlock? Would it be possible to resume the evaluator thread (by updating breakMVar) before entering a BLACKHOLE?

Trac metadata

Trac field	Value
Differential revisions	D4535 → D4535, D5179

I don't see a good way to solve this. The thread that is evaluiating foo is stopped at a breakpoint - that's what the user asked for, so it's not entirely surprising that if they evaluate something that requires foo then it deadlocks.

What would we like to happen? I can think of a couple of alternatives:

*1. Just make it work**

Should it automatically continue evaluation of foo? How would you know when to do that? Evaluating a BLACKHOLE doesn't necessarily mean that we're about to deadlock, we might be evaluating something that another thread is evaluating. As soon as we release the breakMVar the thread will continue evaluating foo, but I don't know of a way to tell whether/when we should do that.

Perhaps instead of the MVar, a breakpoint should be an asynchronous exception so that we end up with a thunk that we could poke to continue evaluation? That would make this work, but it would mean a big change to the way breakpoints work and I'm not sure whether it would run into other problems. One potential problem is that it's a lot more expensive than the current breakpoint mechanism, so :trace wouldn't work so well.

*2. Make it an error of some kind**

e.g.

[main.hs:2:7-11] *Main> _t1
*** Exception: blocked on breakpoint 1

The question is how to achieve that. Perhaps we periodically monitor the thread we just created to do the evaluation and check whether it's blocked on a blackhole, and then compare the owner of the blackhole it is blocked on against all the threads we know are currently at breakpoints? That could possibly work, but it's tricky to implement.

Here's another example that deadlocks even with GHC 8.6:

foo = 0 : bar
bar = 1 : foo

in GHCi:

GHCi, version 8.6.1: http://www.haskell.org/ghc/  :? for help
:Loaded GHCi configuration from /home/omer/rcbackup/.ghci
[1 of 1] Compiling Main             ( test.hs, interpreted )
Ok, one module loaded.
λ:1> :break foo
Breakpoint 0 activated at test.hs:1:7-13
λ:2> foo
Stopped in Main.foo, test.hs:1:7-13
_result :: [Integer] = _
[test.hs:1:7-13] λ:3> :print bar
bar = (_t1::[Integer])
[test.hs:1:7-13] λ:4> _t1
[1

The reason why we don't get "TSO entered" here is because _t1 stands for bar, and bar itself is not locked by the evaluator thread. Instead an object in bar's payload is owned.

I think this shows that even if we could somehow release the MVar in the original reproducer there will be deadlocks.

I think we should:

Merge the patch. We should never enter a TSO or BLOCKING_QUEUE.
Document this behavior in the user manual
Disallow evaluating BLACKHOLEs in GHCi

The last step would fix the original reproducer, but my example above will still deadlock and that's what you get for having lazy evaluation.

changed title from GHCi debugger segfaults when trying force a certain variable to GHCi debugger panics when trying force a certain variable

changed the description

mentioned in commit 45ed4619

closed

changed milestone to %8.8.1

Trac metadata

Trac field	Value
Resolution	Unresolved → ResolvedFixed

Should this really be closed? I thought we want to do implement rest of the ideas in ticket:8316#comment:160624 too.

reopened

I left a comment drawing attention to this on the D5179 before merging but yes, you are right, we should wait to close this until all of these tasks are done.

Trac metadata

Trac field	Value
Resolution	ResolvedFixed → Unresolved

changed milestone to %⊥

changed weight to 5

The panic has been fixed, rest of ticket:8316#comment:160624 still need to be implemented, but they're not as urgent. Removing the milestone and marking the ticket as 'newcomer'.

Trac metadata

Trac field	Value
Priority	high → normal

added newcomer label and removed debugger label

added broken test label

added Pnormal label

GHCi debugger panics when trying force a certain variable

Child items 0

Activity

GHCi debugger panics when trying force a certain variable

Relates to

Activity