waitForProcess and getProcessExitCode are unsafe against asynchronous exceptions

In this description of the current behavior of waitForProcess, assume for simplicity the following:

The process being waited on has already terminated.
Neither waitForProcess nor getProcessExitCode has previously been called for this ProcessHandle.
waitpid() returns success on the first try; it does not get interrupted by a signal.

Under these assumptions, waitForProcess currently behaves as follows:

It is passed a ProcessHandle named ph. ProcessHandle is defined like so:

data ProcessHandle__ = OpenHandle PHANDLE | ClosedHandle ExitCode
newtype ProcessHandle = ProcessHandle (MVar ProcessHandle__)

It allocate a CInt using alloca; the pointer to it is named pret.
It passes pret to c_waitForProcess. c_waitForProcess makes a system call to waitpid(), and populates pret with the result.
waitForProcess peeks into pret and then mutates ph, changing it from an OpenHandle to a ClosedHandle.

There is already a comment in the source code mentioning that this approach is unsafe:

        -- don't hold the MVar while we call c_waitForProcess...
        -- (XXX but there's a small race window here during which another
        -- thread could close the handle or call waitForProcess)

, which is correct. However, it is also unsafe against asynchronous exceptions, and would remain so even if the MVar were held during the c_waitForProcess call. If an asynchronous exception occurs in between steps 3 and 4, then the system will be left in a state in which the child process has been successfully waited on by the OS, but the ProcessHandle is still in an OpenHandle state and the exit code has been lost. A subsequent call to waitForProcess will result in waitpid() unexpectedly returning ECHILD, or worse, the OS will have recycled the child process's PID and waitpid() will wait on the wrong process.

getProcessExitCode has the same bug.

I propose fixing this by redefining ProcessHandle like so:

data ProcessHandle = ProcessHandle PHANDLE (Ptr CInt) (MVar (Ptr CInt))

, where the second argument maybe points to the process's return code, and the third argument contains a pointer to a boolean flag indicating whether the second argument points to something meaningful. Extend the signature of c_waitForProcess to take both pointers. Then, let c_waitForProcess provide the following contract. If called as c_waitForProcess ph pret pflag:

*pflag is required to be false at the start of the call.
If and only if the call to waitpid() returns successfully, then *pflag will be set to true and *pret will be populated with the exit code.

Then, waitForProcess can behave as follows:

The entire routine is wrapped in withMVar. Asynchronous exceptions are allowed to occur.
If *pflag is true, then construct an ExitCode from *pret and return it.
Otherwise, call c_waitForProcess (with the same retry behavior as presently), and then construct and return an ExitCode after a successful return.

And getProcessExitCode can:

Mask exceptions.
tryTakeMVar. If the MVar is already held, unmask exceptions and return Nothing.
Peek at *pflag.
Replace the MVar.
Unmask exceptions.
If *pflag is true, then construct and return an ExitCode from *pret.
Otherwise, return Nothing.

Trac metadata

Trac field	Value
Version	7.6.3
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	libraries/process
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information

waitForProcess and getProcessExitCode are unsafe against asynchronous exceptions

Child items ...

Activity