Opened 4 years ago

Last modified 3 months ago

#7353 new bug

Make system IO interruptible on Windows

Reported by: joeyadams
Owned by: refold
Priority: normal
Milestone: 8.4.1
Component: Core Libraries
Version: 7.6.1
Keywords:
Cc: idhameed@…, bos@…, shelarcy@…, dagitj@…, leon.p.smith@…, the.dead.shall.rise@…, hvr, ekmett, Phyx-, core-libraries-committee@…, drkoster@…
Operating System: Windows
Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

Currently, IO operations (connect, read, etc.) cannot be interrupted on Windows. Additionally, threadWaitRead and threadWaitWrite do not work on Windows (IIRC, they can't tell if the given file descriptor is a socket or not, so they assume it's not).

The reason is that GHC does not have an IO manager for Windows like it does for *nix. For Windows with -threaded, the network package uses blocking FFI calls, which cannot be interrupted by asynchronous exceptions. Thus, System.Timeout and killThread can't be used to time out and interrupt network IO.
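To illustrate what "interruptible" means here, the following is a minimal sketch that uses threadDelay as a stand-in for a slow network call. The names slowOperation and demoTimeout are hypothetical and are not taken from the attached timeout.hs. Because threadDelay can receive asynchronous exceptions, System.Timeout cuts it short; a blocking FFI call in the same position would not be cut short, which is the behaviour reported above for Windows.

```haskell
import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- | Hypothetical stand-in for a slow network call. threadDelay is an
-- interruptible operation, so an asynchronous exception can end it early.
slowOperation :: IO String
slowOperation = do
  threadDelay 5000000          -- pretend the peer takes 5 seconds to answer
  return "finished"

-- | Give the operation 0.1 seconds. 'timeout' works by throwing an
-- asynchronous exception at the calling thread, so it only helps when the
-- wrapped action can actually be interrupted.
demoTimeout :: IO (Maybe String)
demoTimeout = timeout 100000 slowOperation
```

Running demoTimeout in GHCi yields Nothing after roughly a tenth of a second, since the delay is abandoned rather than run to completion.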

I'm working on a program that needs to run on a Windows XP machine for long periods of time, interacting with a couple intermittent network services. This issue is making things very difficult for me, so I want to get it fixed. I do not need to support thousands of concurrent connections efficiently; I just need my program to run reliably.

What needs to happen for IO to be interruptible on Windows with -threaded? I'm willing to put in the work, but there's a lot I'd have to learn about Windows IO and GHC IO handling. One question to get started: are threadWaitRead/threadWaitWrite even possible to implement on Windows, or might Windows assign overlapping HANDLE numbers to sockets and regular files?

This issue spans both GHC and the network package, but I'd like to collect information about it in one place. Related issues:

Attachments (2)

timeout.hs (264 bytes) - added by joeyadams 4 years ago.
Simple example of hanging network IO; times out on Linux, but not on Windows.
0001-GHC.Windows-more-error-support-guards-system-error-s.patch (8.0 KB) - added by joeyadams 4 years ago.
Patch for base that extends GHC.Windows (mostly error code stuff)

Change History (37)

Changed 4 years ago by joeyadams

Attachment: timeout.hs added

Simple example of hanging network IO; times out on Linux, but not on Windows.

comment:1 Changed 4 years ago by simonmar

Component: Runtime System → libraries/base
difficulty: Unknown
Milestone: 7.8.1

There are a few options:

  • maybe foreign import ccall interruptible will work for the FFI calls in the network package. It causes CancelSynchronousIo() to be called when the thread is the target of an exception, but I have no idea whether this will actually work to interrupt the operation or not.
  • make a version of threadWaitRead that works for sockets. This is easy; see #5797. However, this won't necessarily cancel the operation when an exception is raised, you'll need to arrange that separately somehow.
  • Implement an IO manager for Windows. This is of course the best solution, but it's a lot of work. You'd need to bind all the appropriate APIs in the base package, maybe copying or moving bits of the Win32 package into base. Ideally instead of FDs in the IO library we would use Win32 HANDLEs, so there would need to be Win32 replacements for GHC.IO.FD and GHC.IO.Handle.FD. This would have the nice effect of eliminating both the mingw and msvcrt layers from the IO library on Win32. Then the IO manager can use Win32 overlapped I/O.

comment:2 Changed 4 years ago by joeyadams

maybe foreign import ccall interruptible will work for the FFI calls in the network package.

Unfortunately, CancelSynchronousIo is only available on Windows Vista and up. My program has to run on Windows XP.

I looked into some potential approaches to wait for IO on Windows. Please do point out any errors in this assessment.

Completion ports

This would involve a manager thread that repeatedly calls GetQueuedCompletionStatus. To perform sends and receives, we would use calls like ConnectEx and WSARecv. Caveats:

  • IOCP doesn't provide a way to wait for socket readiness, as far as I can tell. This means threadWaitRead and IODevice.ready will have to be emulated by some other means.

On the other hand, it might be possible using zero-size reads/writes.

  • IO operations are sensitive to the calling thread. From the documentation of WSARecv:

Note All I/O initiated by a given thread is canceled when that thread exits. For overlapped sockets, pending asynchronous operations can fail if the thread is closed before the operations complete. See ExitThread for more information.

Thus, we'll probably need a manager that assigns I/O jobs to threads such that no thread has multiple pending jobs involving the same HANDLE.

select

We could have a thread call select to wait on sockets in bulk. Caveats:

  • select() is limited to 64 sockets, so we'd have to manage a pool of threads to wait for more sockets.
  • As far as I can tell, there is no way to interrupt select() except by giving it a short timeout, or by writing to a control socket to prod the IO manager (the GHC event manager does this). We can't create such a socket on Windows without making the program host on a system port.

I suppose we could repeat the select() every 0.1 seconds or so, but this would cause a lot of latency; each read and write would spend up to this long waiting for the IO manager.

A faster approach would be to have the caller do a blocking select for a short period of time. If that times out, then we use the IO manager. This keeps quick waits quick, and has little effect on longer waits.

WSAEventSelect

We could instead use WSAEventSelect and WaitForMultipleObjects, which provides more flexibility than select(), and lets us create our own event handle which we can use to interrupt the IO manager. Caveats:

  • WSAEventSelect sets the socket to non-blocking mode, and cancels any previous WSAEventSelect and WSAAsyncSelect calls on the same socket. This might clash with libraries.
  • WaitForMultipleObjects is also limited to 64 handles, so we'd have to manage a thread pool.

A plan

Here's a plan: implement an IO manager for Windows using WaitForMultipleObjects, which allows callers to wait on their own HANDLEs using a function like this:

registerHandle :: EventManager -> (HandleKey -> IO ()) -> HANDLE -> IO HandleKey

Using WSAEventSelect, we can implement the following on top:

evtRead, evtWrite, evtOOB, evtAccept, evtConnect, ... :: Event

registerSocket :: EventManager -> (SocketKey -> Event -> IO ()) -> Fd -> Event -> IO SocketKey

This API is modeled off of GHC.Event.

Now we can implement an alternative to threadWaitRead and threadWaitWrite for Windows sockets:

waitSocket :: Event -> Fd -> IO ()

With this, it should be possible to update Network.Socket and GHC.IO.FD so blocking operations can be interrupted without leaking OS threads. Mission accomplished.
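As a rough illustration of how waitSocket could sit on top of a callback-registration API, here is a toy model that involves no Windows calls at all. Everything in it is an assumption made for the sketch: the EventManager is just an IORef of one-shot registrations, fireEvent plays the role of the manager thread noticing an event, registerSocket is simplified (no key is returned, so nothing is unregistered on exception), and waitSocket takes the manager explicitly.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import Data.List (partition)
import System.Timeout (timeout)

-- Hypothetical model types; none of these are the real GHC definitions.
newtype Fd = Fd Int deriving (Eq, Show)
data Event = EvtRead | EvtWrite deriving (Eq, Show)

-- | Toy event manager: a list of one-shot registrations.
newtype EventManager = EventManager (IORef [(Fd, Event, IO ())])

newEventManager :: IO EventManager
newEventManager = EventManager <$> newIORef []

-- | Register a one-shot callback for an event on a socket.
registerSocket :: EventManager -> IO () -> Fd -> Event -> IO ()
registerSocket (EventManager ref) callback fd evt =
  atomicModifyIORef' ref (\regs -> ((fd, evt, callback) : regs, ()))

-- | What the manager thread would do once the OS reports an event:
-- remove the matching registrations and run their callbacks.
fireEvent :: EventManager -> Fd -> Event -> IO ()
fireEvent (EventManager ref) fd evt = do
  ready <- atomicModifyIORef' ref $ \regs ->
    let (hit, miss) = partition (\(f, e, _) -> f == fd && e == evt) regs
    in (miss, hit)
  mapM_ (\(_, _, callback) -> callback) ready

-- | Block until the event is reported. The wait is a plain takeMVar,
-- so timeout and killThread can interrupt it.
waitSocket :: EventManager -> Event -> Fd -> IO ()
waitSocket mgr evt fd = do
  woken <- newEmptyMVar
  registerSocket mgr (putMVar woken ()) fd evt
  takeMVar woken

-- | One wait that is satisfied and one that has to be timed out.
demoWait :: IO (Maybe (), Maybe ())
demoWait = do
  mgr <- newEventManager
  _ <- forkIO (threadDelay 50000 >> fireEvent mgr (Fd 1) EvtRead)
  ready   <- timeout 1000000 (waitSocket mgr EvtRead (Fd 1))
  stalled <- timeout 20000 (waitSocket mgr EvtRead (Fd 2))
  return (ready, stalled)
```

The point of the design is that the application thread only ever blocks on an MVar, so interruption comes for free; the OS-level waiting is confined to the manager.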

We could probably do better with IOCP, but I think that would be more complex.

comment:3 Changed 4 years ago by simonmar

I'm not an expert on Windows async I/O and I haven't read all the documentation thoroughly, so please take my opinions with a pinch of salt.

I'm personally not concerned about being able to implement threadWaitRead: I think of that as an internal API used to implement blocking I/O. We only have it because this API happens to work with select()/epoll(). If the underlying OS interfaces don't map well to threadWaitRead, then we should redesign the API at that level, rather than going to a lot of trouble to emulate threadWaitRead.

It looks like your plan would work, but I wonder whether we ought to go to IOCP instead. Regarding the two points you make about IOCP:

  • no way to wait for readiness: as I said, I don't think we need threadWaitRead, and IODevice.ready can be implemented another way (we already have fdReady for this)
  • thread sensitivity: this is slightly annoying. One solution is to make the I/O manager a bound thread and hand off all I/O operations to that thread; then we just ensure that this thread never exits. Or, if the hand-off to a bound thread is too expensive, we will need a separate mechanism to prevent worker threads that have outstanding I/O requests from being terminated; a simple refcounting scheme will probably work fine here.

comment:4 Changed 4 years ago by joeyadams

Good point. Indeed, hWaitForInput uses a blocking FFI call even on Unix. I think we should still provide a way to wait for sockets on Windows that actually works, but that can be a separate ticket.

Thanks for pointing that out. IOCP it is.

I should elaborate more on the thread sensitivity issue. Based on what I've read, overlapped IO is canceled on thread exit. Moreover, to cancel pending IO ourselves, we need to call CancelIo from the same OS thread that initiated the IO, to avoid canceling unrelated IO. The latter restriction can be avoided by using CancelIoEx, but it requires Windows Vista or later.

Therefore, we enforce this rule: for each OS thread, no more than one pending IO request per HANDLE. Thus, one thread may serve a large number of HANDLEs, but multiple concurrent requests on the same HANDLE will require separate threads.

The IO manager would provide the following wrapper:

-- | Do something that supports overlapped I/O, and wait for it to complete.
withIOCP :: HANDLE                    -- ^ Device to operate on.  Must be associated
                                      --   with the IO manager using 'registerHandle'.
         -> (Ptr OVERLAPPED -> IO ()) -- ^ Callback which starts the IO
         -> IO Completion

-- | Associate a device handle with the IO manager's completion port.
-- It will be unassociated automatically when the handle is closed.
registerHandle :: HANDLE -> IO ()

It does the following sequence:

  • Allocate and zero an OVERLAPPED structure, paired with an MVar wrapped in a StablePtr used for signaling completion.
  • Acquire an OS thread that isn't waiting for a completion on the same HANDLE. Both the callback and CancelIo need to be called from the same OS thread, and this thread must not go away until we're done.
  • Run the callback, passing it the new OVERLAPPED structure. On exception, free our OVERLAPPED and StablePtr.

Note that if the IO completes immediately (which can happen), the completion port will still receive a completion. Possible optimization: skip waiting for the IO manager and read the OVERLAPPED structure right away; I don't know if this is safe, though.

  • Wait for the IO manager to fill our MVar. On exception, issue CancelIo. We do not need to unregister the OVERLAPPED or free it, since CancelIo will cause a completion to be delivered, meaning we can free it from the IO manager.
  • Return the completion (either an error code, or the number of bytes transferred).

So how do we do the "acquire an OS thread" step? Well, one way would be to manage a thread pool using forkOS, and pass IO jobs to it. But that means handing every IO job off to an OS thread.

Suppose Control.Concurrent had a wrapper that lets us simply use the current thread, then use that same thread later on. The problem is, what if the current OS thread, after starting IO for one green thread, makes a blocking FFI call for another green thread? We'd have to wait for that to complete before doing our CancelIo.

I think the hand-off is unavoidable, unless:

  • The caller is already running a bound thread.

To implement both of these optimizations, we would need a couple extensions to Control.Concurrent:

  • Reference counting to keep an OS thread alive until we say we're done with it, e.g.:
withPinnedThread :: IO a -> IO (a, IO ())

This increments the current OS thread's reference count, runs the callback entirely within that thread, and returns an action that decrements the thread's reference count.

These would be nice to have, but the Windows IO manager should work just fine without them.

Thanks for the guidance, especially at this critical stage. I'll see if I can implement this. Thanks to Felix Martini's winio package, much of the work is already done.

comment:5 Changed 4 years ago by simonmar

Ah, well done for finding the winio package - I was looking for it earlier but couldn't remember what it was called or who the author was.

So I think your design involves a single IO manager thread and a pool of worker OS threads to start/cancel each IO request, right? While this would work, I think it is not ideal, because you'll need one OS thread for each in-progress IO operation, since all of these OS threads are blocked in the (Ptr OVERLAPPED -> IO ()) function waiting for the result.

Perhaps instead make the IO manager itself a bound thread, and perform all the start and cancel operations on the IO manager thread. Then an IO request is performed by sending the IO manager a pair: (start, MVar Completion), where start is the operation to start the IO and the MVar signals completion. The calling thread would block on the MVar inside a catch, where the exception handler tells the IO manager to cancel the IO by passing it a cancel action.

I suppose in general you will need multiple IO manager threads, to avoid the problem with CancelIo cancelling all the pending IO associated with a given HANDLE. We expect it to be rare to have multiple outstanding IO requests on the same HANDLE though - maybe you could even make it an error.
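The hand-off described in this comment can be modelled in a few lines. This is only a sketch under stated assumptions: Manager, runManager and request are hypothetical names, the manager is started with forkIO (a real one would presumably be a bound thread created with forkOS), and the "start" action in the demo completes immediately instead of issuing overlapped I/O.

```haskell
import Control.Concurrent
  (Chan, MVar, forkIO, myThreadId, newChan, newEmptyMVar,
   putMVar, readChan, takeMVar, writeChan)
import Control.Exception (onException)
import Control.Monad (forever)

-- | Hypothetical manager: a queue of actions that must all run on the
-- manager's own thread.
type Manager = Chan (IO ())

-- | Manager loop: run queued start/cancel actions one at a time.
-- (A job that throws would kill this loop; error handling is omitted.)
runManager :: Manager -> IO ()
runManager jobs = forever $ do
  job <- readChan jobs
  job

-- | Send the manager a start action together with the MVar that will
-- receive the completion, then block on that MVar. If the caller is
-- interrupted, the cancel action is queued so that it also runs on the
-- manager thread.
request :: Manager -> (MVar c -> IO ()) -> IO () -> IO c
request jobs start cancel = do
  done <- newEmptyMVar
  writeChan jobs (start done)
  takeMVar done `onException` writeChan jobs cancel

-- | Check that the start action really ran on the manager thread.
demoManager :: IO (Int, Bool)
demoManager = do
  jobs   <- newChan
  mgrTid <- forkIO (runManager jobs)
  ranOn  <- newEmptyMVar
  r <- request jobs
         (\done -> myThreadId >>= putMVar ranOn >> putMVar done 42)
         (return ())
  tid <- takeMVar ranOn
  return (r, tid == mgrTid)
```

Because both start and cancel are executed by the same thread, the CancelIo same-thread restriction would be satisfied without a per-request worker.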

comment:6 Changed 4 years ago by joeyadams

Here's my first shot at the I/O manager: https://github.com/joeyadams/haskell-iocp .

Replying to simonmar:

... because you'll need one OS thread for each in-progress IO operation, since all of these OS threads are blocked in the (Ptr OVERLAPPED -> IO ()) function waiting for the result.

Actually, we don't. We just need to make sure a HANDLE doesn't use the same OS thread for multiple simultaneous operations. Other than that, multiple HANDLEs may share a single OS thread. For example, this is okay:

  • Worker 1: pending recv for A, pending recv for B
  • Worker 2: pending send for A

But this is not okay:

  • Worker 1: pending recv for A, pending recv for B, pending send for A

In the second scenario, if we issue CancelIo on A, it will cancel both pending jobs, not just the one we want to cancel.
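The placement rule can be stated as a small pure function. This is a hypothetical model, not code from haskell-iocp: a worker is represented only by the list of handles that have a pending operation started from it, and assignWorker picks the first worker with nothing pending on the given handle, adding a new worker when none qualifies.

```haskell
-- | Hypothetical model: handles are named by a character, and a worker is
-- the list of handles with an operation currently pending on it.
type HandleName = Char
type Worker = [HandleName]

-- | Place a new pending operation for handle h. Returns the index of the
-- chosen worker and the updated worker list.
assignWorker :: HandleName -> [Worker] -> (Int, [Worker])
assignWorker h = go 0
  where
    -- No existing worker is free for h: append a fresh one.
    go i [] = (i, [[h]])
    go i (w : ws)
      -- This worker has nothing pending on h, so CancelIo on h from this
      -- worker would affect only the new operation.
      | h `notElem` w = (i, (h : w) : ws)
      | otherwise =
          let (j, ws') = go (i + 1) ws
          in (j, w : ws')

-- | The scenario from the comment: recv on A, recv on B, then send on A.
demoAssign :: [Int]
demoAssign =
  let (i1, ws1) = assignWorker 'A' []
      (i2, ws2) = assignWorker 'B' ws1
      (i3, _)   = assignWorker 'A' ws2
  in [i1, i2, i3]
```

demoAssign evaluates to [0,0,1]: both receives share the first worker, and the send on A is pushed to a second one.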

There is only one thread that is expected to block for extended periods of time, and that is the completion handler, which waits for completions by calling GetQueuedCompletionStatus. Starting and canceling overlapped I/O should not block.

We expect it to be rare to have multiple outstanding IO requests on the same HANDLE though - maybe you could even make it an error.

Actually, this can happen if the program needs to send a message while it is waiting to receive a message. I suppose we could have threads for different classes of operation (e.g. send, receive, control).

Instead, this is implemented with a nifty trick involving lazy IO. We create an infinite list of workers using interleaved IO, and have each IOCPHandle maintain its own free list of workers. Thus, handles share workers, and new workers are created automatically as needed.
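A minimal sketch of the lazy-list idea, assuming unsafeInterleaveIO is the interleaving primitive in question. lazyPool and demoPool are hypothetical names, and the "worker" here is just a counter value rather than an OS thread.

```haskell
import Control.Exception (evaluate)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- | Build a conceptually infinite list whose elements are created by
-- running the given action the first time each list cell is forced.
lazyPool :: IO a -> IO [a]
lazyPool mk = unsafeInterleaveIO $ do
  x  <- mk
  xs <- lazyPool mk          -- deferred again; nothing runs yet
  return (x : xs)

-- | Count how many workers exist before and after three are demanded.
demoPool :: IO (Int, Int)
demoPool = do
  created <- newIORef (0 :: Int)
  -- Stand-in for spawning a worker thread: hand out the next id.
  let mkWorker = atomicModifyIORef' created (\n -> (n + 1, n))
  pool   <- lazyPool mkWorker
  before <- readIORef created
  _      <- evaluate (length (take 3 pool))   -- force three list cells
  after  <- readIORef created
  return (before, after)
```

demoPool returns (0, 3): no worker is created until a list cell is demanded, and only as many are created as are actually used.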

One problem I'd like to address: callers currently can't use functions like alloca to allocate buffers for overlapped I/O. Moreover, hPutBuf would have to copy the data into a temporary buffer.

This is because the OS owns the buffer until a completion arrives, even in the case of cancellation. Thus, the application thread should wait for CancelIo to complete (the implementation currently doesn't do this).

This should be easy to fix: on exception, tell the worker thread to CancelIo, and signal our MVar when it's done. Use uninterruptibleMask_ (takeMVar mv) to wait for the signal. This will also simplify the lifetime of completion objects and worker allocations.

But is it safe? It should be if my assumptions are correct:

  • Starting overlapped I/O and calling CancelIo do not block. We have to make this assumption anyway, if we are going to let handles share worker threads.
  • uninterruptibleMask_ (takeMVar mv) succeeds when another bound thread fills the MVar. That is, the bound thread won't be blocked due to our use of uninterruptibleMask_, as it is running from a different OS thread.
  • uninterruptibleMask is indeed not interruptible.

Are my assumptions about uninterruptibleMask true?
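The cancellation wait discussed in this comment can be tried out without any Windows API. In the sketch below, a forked thread stands in for the device and worker: it never completes by itself and acknowledges a cancel request by filling the completion MVar. The names awaitOrCancel, Completion and demoCancel are hypothetical, and the model only illustrates the uninterruptibleMask_ (takeMVar mv) step, not real overlapped I/O.

```haskell
import Control.Concurrent (MVar, forkIO, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (onException, uninterruptibleMask_)
import Data.IORef (newIORef, readIORef, writeIORef)
import System.Timeout (timeout)

-- | Hypothetical completion result.
data Completion = Finished Int | Cancelled
  deriving (Eq, Show)

-- | Wait for a completion. If interrupted, request cancellation and then
-- wait, uninterruptibly, until the completion has been delivered, so the
-- caller's buffer is not released while the OS may still be using it.
awaitOrCancel :: IO () -> MVar Completion -> IO Completion
awaitOrCancel requestCancel done =
  takeMVar done `onException`
    uninterruptibleMask_ (requestCancel >> takeMVar done)

-- | Time out a request that never finishes and check that the cancel
-- was acknowledged before control returned to the caller.
demoCancel :: IO (Maybe Completion, Bool)
demoCancel = do
  done      <- newEmptyMVar
  cancelReq <- newEmptyMVar
  released  <- newIORef False
  -- Simulated device: only reacts to a cancel request.
  _ <- forkIO $ do
    takeMVar cancelReq
    writeIORef released True     -- the "OS" is finished with the buffer
    putMVar done Cancelled
  r  <- timeout 50000 (awaitOrCancel (putMVar cancelReq ()) done)
  ok <- readIORef released
  return (r, ok)
```

demoCancel returns (Nothing, True): the timeout fires, but the caller does not resume until the simulated device has reported the cancellation.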

comment:7 Changed 4 years ago by joeyadams

I updated https://github.com/joeyadams/haskell-iocp to wait for cancellation using uninterruptibleMask. IOCP.Manager should be easier to follow now.

comment:8 Changed 4 years ago by joeyadams

My IOCP manager, https://github.com/joeyadams/haskell-iocp, is now fairly complete, and supports updatable timeouts like GHC.Event has. Where credit is due:

  • Felix Martini, author of the winio package. This code clarified how to use the completion port API, and gave me the idea to use StablePtr to carry context alongside the OVERLAPPED structure.
  • Authors of GHC.Event, for paving the way and identifying issues such as #3838 . The updatable timeout support in haskell-iocp is largely based on that from GHC.

Quick overview of the I/O manager design:

At the core is a thread which alternates between waiting for completions and calling expired timeout callbacks (much like GHC.Event does). Each completion carries a callback that the I/O manager thread simply executes.

registerTimeout sends a callback to the I/O manager telling it to update its state. This eliminates the need for an IORef to hold the timeout queue.

Overlapped I/O operations, on the other hand, do have to fight over an IORef. Because CancelIo does not let us specify the operation to cancel, we have to use separate worker threads if there are multiple concurrent operations on the same handle.

withOverlapped wraps system calls that support overlapped I/O, using the I/O manager to wait for completion. It basically works as follows:

  • Select a worker thread from which to initiate the I/O. Use the main I/O manager thread (same thread that waits for completions and handles timeouts) whenever it is available, to avoid an expensive context switch.
  • In this worker thread, allocate an OVERLAPPED with a callback that signals completion to the application thread. Call the StartCallback with this OVERLAPPED.
  • If withOverlapped is interrupted by an asynchronous exception, issue CancelIo from the same worker thread that started the I/O.
  • Do not let the application thread proceed until we are done operating on the handle.

Can we add this I/O manager to the base package? If so, what should the module be named? GHC.Event.IOCP?

My goal is to make socket operations interruptible on Windows with -threaded. I don't have time to extend it to file I/O right now.

Changed 4 years ago by joeyadams

Attachment: 0001-GHC.Windows-more-error-support-guards-system-error-s.patch added

Patch for base that extends GHC.Windows (mostly error code stuff)

comment:9 Changed 4 years ago by joeyadams

Blocked By: 7415 added

comment:10 Changed 4 years ago by joeyadams

On Fri, Nov 16, 2012 at 5:34 AM, Simon Marlow wrote:

I suppose what I'm mainly concerned about is whether we're building in a requirement to do an OS-thread context switch for every IO request, which would be quite expensive. That seems to be part of the current design, but I admit I still don't fully understand the details. (I'll take another look at the code now).

You're right, each IO request does involve context switches. Here's how a typical IO request proceeds:

  • Application thread sends work to IO manager thread, then waits on an MVar
  • IO manager thread dequeues a completion with this work, and executes it.
  • IO manager thread later dequeues a completion signifying that the work completed, then fills the MVar so the application thread can proceed.

My IO manager runs in a bound thread, unlike the one in GHC.Event. This means there are two context switches per operation. On my system, each request takes about 30 microseconds (for comparison, a system call to gettimeofday takes about 1 microsecond).

GHC.Event will probably do better than my IO manager for lots of little sequential operations in a single thread. But for lots of IO running concurrently, my IO manager should have decent performance per operation, provided threads are scheduled like this:

  • Thread posts work to IO manager, and waits on MVar. RTS schedules another thread without switching OS threads. This thread also posts work to IO manager, and so on.
  • IO manager thread picks up several work requests and executes them, all without a context switch.
  • IO manager thread picks up several completions and fills the corresponding MVars, again without a context switch.
  • RTS reschedules application threads one by one.

If we want better performance for sequential operations, we'll have to do something more clever. One idea might be to introduce a new scheduling primitive:

-- | Allow the RTS to schedule unbound threads to the current
-- operating system thread for the given number of microseconds.
-- This may only be called from a bound thread.
donateTimeSlice :: Int -> IO ()

-- | Wake up a call to 'donateTimeSlice' issued by a bound thread.
endTimeSlice :: ThreadId -> IO ()

This way, when the IO manager detects no pending IO operations, it donates time to the thread pool, rather than blocking on GetQueuedCompletionStatus. When the application sends work to the IO manager, it calls endTimeSlice so the IO manager can wake up and check for completion packets.

Ideally, application code will spend most of its time in donated time slices, so the RTS can schedule the IO manager without context switching.

In any case, the IO manager needs control of what OS thread it runs on, since both GetQueuedCompletionStatus and overlapped I/O system calls are sensitive to the current thread.

One thing you do have to be careful of is that the RTS needs to be able to start and stop the IO manager itself; see win32/ThrIOManager.c. Perhaps you're not planning to integrate the existing IO manager with yours, but in that case we'll have two IO manager threads with different purposes - is your IO manager handling threadDelay?

Yes, my IO manager will handle threadDelay. The existing implementation in GHC.Conc.Windows works OK (can be interrupted), but does not scale well, at least in theory. It uses insertion sort to keep timeouts in order. My IO manager, like GHC.Event, uses a priority search queue (GHC.Event.PSQ) to track timeouts.
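For a feel of why an ordered map or priority queue helps here, the following sketch keeps timeouts in a Data.Map keyed by expiry time. This is an assumption made for illustration: Data.Map (from the containers package) stands in for GHC.Event.PSQ, and the names TimeoutQueue, insertTimeout and popExpired are hypothetical. Insertion is logarithmic rather than linear as with an insertion-sorted list, and all expired entries can be split off in one step.

```haskell
import qualified Data.Map.Strict as Map

type Micros = Integer

-- | Expiry time paired with a unique number, so that two timeouts due at
-- the same instant do not collide.
type TimeoutKey = (Micros, Int)

type TimeoutQueue a = Map.Map TimeoutKey a

-- | Add a timeout due at the given time.
insertTimeout :: Micros -> Int -> a -> TimeoutQueue a -> TimeoutQueue a
insertTimeout expiry unique = Map.insert (expiry, unique)

-- | Split off every timeout due at or before now, earliest first.
popExpired :: Micros -> TimeoutQueue a -> ([a], TimeoutQueue a)
popExpired now q =
  let (due, rest) = Map.spanAntitone (\(expiry, _) -> expiry <= now) q
  in (Map.elems due, rest)

-- | Three timeouts inserted out of order; two are due at time 20.
demoQueue :: ([String], Int)
demoQueue =
  let q = insertTimeout 30 1 "a"
            (insertTimeout 10 2 "b"
              (insertTimeout 20 3 "c" Map.empty))
      (fired, rest) = popExpired 20 q
  in (fired, Map.size rest)
```

demoQueue evaluates to (["b","c"],1): the two due callbacks come out in expiry order and one timeout remains queued.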

We might have to keep the old IO manager around to handle console events. I'm not sure if you can wait for that using overlapped I/O or not.

Stopping the IO manager might be problematic. According to MSDN, we can't close the completion port HANDLE until all references to it (HANDLEs associated with it) are closed first. Why does the RTS need to start and stop the IO manager? When does this happen?

I pushed a branch named windows-iocp to the base repo. It adds the new IO manager, and uses it to implement threadDelay and registerDelay.

comment:11 Changed 4 years ago by joeyadams

Blocked By: 7415 removed

I don't have time to work on this right now. It's more complicated than I expected:

  • Making the IOCP approach efficient would involve integration with the scheduler and a better understanding of how IOCP interacts with OS threads.
  • Couldn't switch to nonblocking sockets by modifying the network package alone, as GHC doesn't actually support custom IO devices. Some Handle methods cast the device to FD so they can read/write directly from/to the device; see #4144 .

comment:12 Changed 4 years ago by simonmar

@joeyadams: is your current implementation fast enough for your needs? Is there any way to make it available for people to try and/or work on separately, perhaps as a standalone package?

comment:13 Changed 4 years ago by ihameed

Cc: idhameed@… added

comment:14 Changed 4 years ago by bos

Cc: bos@… added

comment:15 Changed 4 years ago by shelarcy

Cc: shelarcy@… added

comment:16 Changed 4 years ago by dagit

Cc: dagitj@… added

comment:17 Changed 4 years ago by lpsmith

Cc: leon.p.smith@… added

comment:18 Changed 4 years ago by refold

Cc: the.dead.shall.rise@… added

comment:19 Changed 3 years ago by thoughtpolice

Milestone: 7.8.3 → 7.10.1

Moving to 7.10.1

comment:20 Changed 3 years ago by martin-bailey

Cc: hvr ekmett added

Is anyone currently working on an IO manager for Windows?

comment:21 Changed 2 years ago by thoughtpolice

Component: libraries/base → Core Libraries
Owner: set to ekmett

Moving over to new owning component 'Core Libraries'.

comment:22 Changed 2 years ago by thoughtpolice

Milestone: 7.10.1 → 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:23 Changed 2 years ago by Phyx-

Cc: Phyx- core-libraries-committee@… added

comment:24 Changed 19 months ago by thoughtpolice

Milestone: 7.12.1 → 8.0.1

Milestone renamed

comment:25 Changed 19 months ago by thomie

Owner: ekmett deleted

comment:26 Changed 14 months ago by thomie

Milestone: 8.0.1

comment:27 Changed 12 months ago by refold

I will be spending the next two months working on this ticket, picking up where @joeyadams left off. My intention is to produce a patchset suitable for inclusion in GHC.

Last edited 12 months ago by refold

comment:28 Changed 11 months ago by Phyx-

Hi @refold,

Just checking to see if you're making any progress on this.

Cheers

comment:29 Changed 11 months ago by refold

@Phyx- So far I got the patches to work with GHC 8, will now focus on optimisation. Hopefully will have something ready for review by the end of this month.

comment:30 Changed 11 months ago by Phyx-

Owner: set to refold

@refold that sounds great, thanks for doing this! Let me know if you need any help.

In the meantime I'll assign the ticket to you.

comment:31 Changed 4 months ago by winter

Cc: drkoster@… added

comment:32 Changed 3 months ago by Phyx-

Hi @refold,

Just checking in to see if you're still working on this. If not would you be able to let me know your last status? This is on the list of major Windows changes I want to target for 8.4.

I assume this is your repository? https://github.com/23Skidoo/haskell-iocp

comment:33 Changed 3 months ago by Phyx-

Milestone: 8.4.1

comment:34 Changed 3 months ago by refold

@Phyx-

Hi,

Sorry for the lack of updates. Some parts of this are working, but in general my patches are not yet ready for inclusion in GHC and I haven't had time to work on them during the last six months. I do still want to finish this, but can't guarantee it will happen in time for 8.4. Anyone willing to help out is welcome to take a look at

and ask me any questions.

comment:35 Changed 3 months ago by Phyx-

Great, thanks @refold, I'll take a look and let you know. I'm certainly planning on trying to get it into the tree. We can just branch it off for now. I want to use this as a basis for the other I/O changes in GHC.

Would be a shame to let all this work go to waste :)
