Opened 11 years ago

Last modified 15 months ago

#2401 new bug

aborting an STM transaction should throw an exception

Reported by: sclv Owned by:
Priority: low Milestone:
Component: Runtime System Version: 6.8.3
Keywords: Cc: rob@…, simonmar, akio
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

The attached test case will hang at +RTS -N2 and above. As noted in the other ticket I submitted (http://hackage.haskell.org/trac/ghc/ticket/2398), this behavior happens on an Core 2 Quad with 64 bit architecture, and does not take place on a PowerPC with 32 bit architecture and no multicore.

Attachments (1)

teststmhangs.hs (384 bytes) - added by sclv 11 years ago.

Download all attachments as: .zip

Change History (22)

Changed 11 years ago by sclv

Attachment: teststmhangs.hs added

comment:1 Changed 11 years ago by sclv

Oh yes, with the unsafeIOToSTM line commented out, it does not hang, and if STM is replaced by standard concurrency constructs it does not hang.

comment:2 Changed 11 years ago by sclv

Component: CompilerRuntime System

comment:3 Changed 11 years ago by igloo

difficulty: Unknown
Milestone: 6.10.1

Thanks for the report; I don't /think/ that the unsafeIOToSTM should be to blame, although I'm not too familiar with STM.

comment:4 Changed 11 years ago by simonmar

Resolution: wontfix
Status: newclosed

I understand what's happening here. Inside a transaction you're doing this:

unsafeIOToSTM $ putStrLn ".."

Now, putStrLn needs to take a lock on stdin. But the STM implementation assumes it can abort a transaction at any time, and in this case it aborts a transaction in the middle of putStrLn (because it discovered the transaction was already invalid). When it aborts the transaction, the exception handler in putStrLn is discarded (STM didn't expect there to be a real IO exception handler in the middle of a transaction). So stdin remains locked, and the program eventually deadlocks.

Perhaps we could fix this by throwing a real exception, but there's quite a large design space there, and in any case when using unsafeIOToSTM you're really on your own. I'll put a big warning in the docs for unsafeIOToSTM though.

comment:5 Changed 11 years ago by sclv

severity: majornormal
Summary: unsafeIOToSTM hangs with 64 bit multicore.unsafeIOToSTM discards exception handlers.

(changing the ticket name to reflect the real issue)

I'm not happy with this for a number of reasons. Primarily because I imagine the same effect could be achieved if, e.g., STM was operating on an ostensibly pure data structure that hid an exception handler inside via, e.g., unsafeInterleaveIO. I'll think more about when/how this could come up using, e.g., only the standard libs, but I'm fairly sure I can manufacture a case.

Second, I think unsafeIOToSTM should be as well supported as unsafePerformIO in that one can break invariants with it if one is not careful, but that it can be used "properly" to introduce new and useful features. As it stands, unsafeIOToSTM with this issue is too unsafe to be used even for debugging, much less for any real code.

Finally, because the semantics of STM as described in the papers don't run into this problem, as the mechanism there only checks for consistency on transaction commits, not at other arbitrary intervals as well (i.e. on GC).

I tend to think that, although it obviously requires discussion, throwing a real exception would make sense, especially since atomic rollback uses the same stack unwinding mechanism in any case, as I understand it.

But, at a minimum, what would be nice is an RTS flag that disables the consistency check on GC, and thus sidesteps this whole issue. Furthermore, since there's a cost-benefit tradeoff to this check in any case, it would be nice to let users tweak that setting and play with it.

comment:6 Changed 11 years ago by sclv

It also occurs to me that there's an easier general solution that throwing retry as a real exception -- a flag could be introduced somewhat like block, except called, e.g., blockRetry, of type STM a -> STM a. Semantics would generally be the same as with block. blockRetry could wrap every call to unsafeIOToSTM, for example, and might prove useful elsewhere in user-generated code.

comment:7 Changed 10 years ago by simonmar

Architecture: UnknownUnknown/Multiple

comment:8 Changed 10 years ago by simonmar

Operating System: UnknownUnknown/Multiple

comment:9 Changed 9 years ago by sclv

Resolution: wontfix
Status: closedreopened
Type of failure: None/Unknown

As per this discussion (http://www.haskell.org/pipermail/glasgow-haskell-users/2008-September/015412.html) I believe this ticket should have been reopened.

Getting it right is, however, contingent on this ticket as well: http://hackage.haskell.org/trac/ghc/ticket/2558

comment:10 Changed 9 years ago by simonmar

Milestone: 6.10.16.14.1
Summary: unsafeIOToSTM discards exception handlers.aborting an STM transaction should throw an exception
Type of failure: None/UnknownIncorrect result at runtime

(changing the bug title to reflect the underlying problem)

Summary:

  • the STM transaction might be in the middle of an unsafePerformIO when it is aborted. The user has no control over this, since transactions can be aborted by the RTS, and the use of unsafePerformIO might be in a library somewhere.
  • The IO in the library expects to be able to catch exceptions and clean up if it is interrupted, otherwise it might leave locks in place. Imagine pulling on some lazy I/O during an STM transaction, for example.

As @sclv pointed out, we also need to fix

#2558
re-throwing an asynchronous exception throws it synchronously

otherwise when the IO operation re-throws the abort exception, it will throw it synchronously.

comment:11 Changed 8 years ago by igloo

Milestone: 7.0.17.0.2

comment:12 Changed 8 years ago by igloo

Milestone: 7.0.27.2.1

comment:13 Changed 7 years ago by igloo

Milestone: 7.2.17.4.1

comment:14 Changed 7 years ago by igloo

Milestone: 7.4.17.6.1
Priority: normallow

comment:15 Changed 6 years ago by igloo

Milestone: 7.6.17.6.2

comment:16 Changed 5 years ago by rleslie

Cc: rob@… simonmar added

As someone who has recently been bitten by this while trying to implement a pure interface to an external library via FFI, I would like to express my enthusiasm for its resolution.

comment:17 Changed 5 years ago by thoughtpolice

Milestone: 7.6.27.10.1

Moving to 7.10.1.

comment:18 Changed 4 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:19 Changed 3 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:20 Changed 3 years ago by thomie

Milestone: 8.0.1

comment:21 Changed 15 months ago by akio

Cc: akio added
Note: See TracTickets for help on using tickets.