Opened 6 years ago

Closed 9 months ago

#2451 closed task (fixed)

New signal-handling API

Reported by: simonmar Owned by: igloo
Priority: highest Milestone: 7.8.1
Component: libraries/unix Version: 6.8.3
Keywords: Cc: pho@…, bos@…, tibbe, anton.nik@…, phunge0@…, leuschner@…, william.knop.nospam@…, lewurm@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description (last modified by simonmar)

The current API for handling signals in System.Posix is lacking in a couple of ways:

  • it isn't modular: there's no way for a library to install a private signal handler, there is only a singla global handler per signal.
  • it isn't possible to get hold of the information from siginfo_t (#592).

I want to propose a new API. This has already undergone a round of changes after discussion with Duncan Coutts, and we ended up with something quite simple. Haddock docs are here:

http://www.haskell.org/~simonmar/unix/System-Posix-Signals.html#4

(also attached as unix-html.tar.gz).

I have working patches to implement this. The old API is still available, and is reimplemented using the new API. Signal handlers installed using the old API will coexist with those installed using the new API.

The main motivation for this change was so that I could hook the SIGCHLD signal in the System.Process library without disturbing users of the System.Posix library who might also want to install a SIGCHLD handler.

Deadline: 1 Aug (2 weeks)

Attachments (1)

unix-html.tar.gz (121.9 KB) - added by simonmar 6 years ago.
Haddock docs for the unix package with new signal API

Download all attachments as: .zip

Change History (38)

Changed 6 years ago by simonmar

Haddock docs for the unix package with new signal API

comment:1 Changed 6 years ago by igloo

  • Milestone set to Not GHC

comment:2 Changed 6 years ago by simonmar

  • Architecture changed from Unknown to Unknown/Multiple

comment:3 Changed 6 years ago by simonmar

  • Operating System changed from Unknown to Unknown/Multiple

comment:4 Changed 5 years ago by simonmar

Update: the internals of the rewrite have now been committed, so far with no API changes.

Thu Feb 19 02:05:32 PST 2009  Simon Marlow <marlowsd@gmail.com>
  * Rewrite of signal-handling.

comment:5 Changed 5 years ago by golubovsky

The change of Feb 19 (namely in System.Posix.Process.Internals) does not work with Hugs as GHC.Conc is not known to Hugs. As a result, the whole Unix package is skipped while building. Please consider this in future updates.

comment:6 Changed 5 years ago by igloo

  • Milestone changed from Not GHC to 6.12 branch
  • Priority changed from normal to high

We should fix this, although it's not immediately clear to me how at first glance. Perhaps move Signal to a portable module? Perhaps even System.Posix.Process.Internals?

comment:7 Changed 5 years ago by igloo

  • Milestone changed from 6.12 branch to 6.12.1

comment:8 Changed 5 years ago by simonmar

  • Owner set to simonmar

comment:9 Changed 5 years ago by simonmar

  • Milestone changed from 6.12.1 to 6.14 branch

No time before 6.12.1.

comment:10 Changed 5 years ago by simonmar

See also #592 (expose siginfo_t to signal handlers)

comment:11 Changed 4 years ago by PHO

  • Cc pho@… added

comment:12 Changed 4 years ago by simonmar

  • Description modified (diff)
  • Type of failure set to None/Unknown

Fixed URL to haddock docs in description.

comment:13 Changed 4 years ago by bos

  • Cc bos@… added

comment:14 Changed 4 years ago by tibbe

  • Cc tibbe added

comment:15 Changed 4 years ago by igloo

  • Milestone changed from 6.14 branch to 6.14.1

comment:16 Changed 4 years ago by lelf

  • Cc anton.nik@… added

comment:17 follow-up: Changed 4 years ago by phunge0

  • Cc phunge0@… added

Given that the motivation was to share SIGCHLD among several code
paths, I have a few questions:

Suppose both System.Process and some other code has forked children.
What should they do inside their signal handler? Typically they'd want
to call wait(), but that might reap a child which had been forked by the
other API, and then the other handler would never see that its child had
exited. So doesn't the handling of SIGCHLD (i.e. the calling of wait())
have to be a global thing?

The only thing I can think of would be for Process's signal handler to
loop over each child it had forked on it's own, and call waitpid()
instead of wait(), but that's sort of gross. The users of System.Posix
would need to do the same.

Another question: in the docs you posted, you suggest installing a no-op
signal handler to ignore a signal. That's good for 99% of programs, but
some programs really need SIG_IGN instead of a no-op handler. The
reason: signals which have a handler are reset to SIG_DFL after an
exec(), but signals which are SIG_IGN remain ignored. One program which
needs this is nohup. Seems like
SIG_IGN is deprecated in the new signal API?

comment:18 in reply to: ↑ 17 ; follow-up: Changed 4 years ago by simonmar

Replying to phunge0:

Given that the motivation was to share SIGCHLD among several code
paths, I have a few questions:

Suppose both System.Process and some other code has forked children.
What should they do inside their signal handler? Typically they'd want
to call wait(), but that might reap a child which had been forked by the
other API, and then the other handler would never see that its child had
exited. So doesn't the handling of SIGCHLD (i.e. the calling of wait())
have to be a global thing?

No - the SIGCHLD handler gets passed the pid of the process that died, so it can decide whether this process belongs to someone else, and if not reap it using waitpid().

Another question: in the docs you posted, you suggest installing a no-op
signal handler to ignore a signal. That's good for 99% of programs, but
some programs really need SIG_IGN instead of a no-op handler. The
reason: signals which have a handler are reset to SIG_DFL after an
exec(), but signals which are SIG_IGN remain ignored. One program which
needs this is nohup. Seems like
SIG_IGN is deprecated in the new signal API?

Yes, this is something I'll need to look into.

Right now the new API is on ice, as I've been busy with other things. The main remaining problem is that signal delivery isn't reliable enough in the currently implementation, because the pipe used to communicate with the IO manager thread can fill up and signals can be dropped, so we need some way to handle that.

comment:19 in reply to: ↑ 18 ; follow-up: Changed 4 years ago by phunge0

Replying to simonmar:

Replying to phunge0:

Given that the motivation was to share SIGCHLD among several code
paths, I have a few questions:

Suppose both System.Process and some other code has forked children.
What should they do inside their signal handler? Typically they'd want
to call wait(), but that might reap a child which had been forked by the
other API, and then the other handler would never see that its child had
exited. So doesn't the handling of SIGCHLD (i.e. the calling of wait())
have to be a global thing?

No - the SIGCHLD handler gets passed the pid of the process that died, so it can decide whether this process belongs to someone else, and if not reap it using waitpid().

Really? My (vague) understanding is that a single SIGCHLD might be generated even though multiple children have exited -- i.e. SIGCHLD will not be queued the way real-time signals are. I see that w/ a test program which blocks SIGCHILD w/ sigprocmask -- 2 children exit, only one signal handler gets invoked.

comment:20 in reply to: ↑ 19 Changed 4 years ago by simonmar

Replying to phunge0:

Really? My (vague) understanding is that a single SIGCHLD might be generated even though multiple children have exited -- i.e. SIGCHLD will not be queued the way real-time signals are. I see that w/ a test program which blocks SIGCHILD w/ sigprocmask -- 2 children exit, only one signal handler gets invoked.

Hmm, if that's the case then we're in even worse shape than I thought, and this approach to implementing System.Process really won't work at all. Is there any way to tell that SIGCHLD signals have been dropped?

comment:21 follow-up: Changed 4 years ago by duncan

Perhaps we can just use SIGCHLD as a notification. When we get a SIGCHLD we can assume that there may be multiple children in a waitable state. We can then call waitpid/waitid multiple times, using the WNOHANG flag so it does not block, then we call it repeatedly until there are no more children to wait for. If we want to avoid waiting for children that do not belong to System.Process then we can use WNOWAIT first time and then call it again if we find the PID was one our ours.

comment:22 Changed 4 years ago by duncan

See also the docs for wait(), particular the rationale near the bottom for a discussion of reliably counting child processes.

comment:24 in reply to: ↑ 21 ; follow-ups: Changed 4 years ago by phunge0

Replying to duncan:

And similarly: http://www.linuxselfhelp.com/gnu/glibc/html_chapter/libc_24.html#SEC491

This is the way I've seen it implemented before and as far as I'm aware is the only non-racy way to get all the exit statuses. We might be able to define an API which combines the nitty-gritty of handling SIGCHLD & waiting into a single operation so to hide this complexity from users. That starts to depart from mimicing POSIX APIs though.

Replying to duncan:

If we want to avoid waiting for children that do not belong to System.Process then we can use WNOWAIT first time and then call it again if we find the PID was one our ours.

Suppose the second waitable process belongs to System.Process, but the first waitable process belongs to something else?

Suppose we call the normal waitpid() instead, and if we find a PID which isn't ours, we stash the status in a temp buffer, to be returned the next time someone else calls System.Posix.getAnyProcessStatus?

Replying to simonmar:

Right now the new API is on ice, as I've been busy with other things. The main remaining problem is that signal delivery isn't reliable enough in the currently implementation, because the pipe used to communicate with the IO manager thread can fill up and signals can be dropped, so we need some way to handle that.

This is tough to fix within a C signal handler! The # of times a signal handler might be invoked before the IO manager thread get to run is unbounded, so you need potentially unbounded memory. But the list of things you're allowed to do inside a signal handler is pretty short (ref: http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html); even memory allocation is verboten, so finding a place to buffer that information somewhere is AFAIK next to impossible.

One suggestion: just block all the signals that you need to handle, and spawn a thread which waits for them using sigwaitinfo (more detail: http://www.linuxjournal.com/article/2121?page=0,1)? Since you're now in thread i.s.o. signal context, you can do whatever you want. On newer linux you could also wait directly in the IO manager using signalfd(2).

This only would work for process-directed signals, not thread-directed signals, but given GHC's runtime model it seems like all the signals that users must handle would be process-directed anyway.

comment:25 in reply to: ↑ 24 Changed 4 years ago by phunge0

Replying to phunge0:

Replying to duncan:

If we want to avoid waiting for children that do not belong to System.Process then we can use WNOWAIT first time and then call it again if we find the PID was one our ours.

Suppose the second waitable process belongs to System.Process, but the first waitable process belongs to something else?

Suppose we call the normal waitpid() instead, and if we find a PID which isn't ours, we stash the status in a temp buffer, to be returned the next time someone else calls System.Posix.getAnyProcessStatus?

Wow, bad wording, let me try to clarify :) The first paragraph is describing a potential problem with what Duncan's proposing, and the second paragraph should read: "As an alternative, how about calling the normal waitpid() instead, ..."

comment:26 in reply to: ↑ 24 ; follow-up: Changed 4 years ago by duncan

Replying to phunge0:

Suppose the second waitable process belongs to System.Process, but the first waitable process belongs to something else?

You're quite right. I had assumed that we could iterate through the list of waitable child processes, but it looks like if we do not reap the first then we never get shown the second.

Suppose we call the normal waitpid() instead, and if we find a PID which isn't ours, we stash the status in a temp buffer, to be returned the next time someone else calls System.Posix.getAnyProcessStatus?

We must not reap processes that do not belong to us, it's not just the Haskell API, it's other C libs in the same process as us. In the worst case we could just call waitpid(pid, WNOHANG) to poll all the pids we are managing. That's a bit ugly when we're managing a lot of processes.

comment:27 in reply to: ↑ 24 Changed 4 years ago by simonmar

Replying to phunge0:

One suggestion: just block all the signals that you need to handle, and spawn a thread which waits for them using sigwaitinfo (more detail: http://www.linuxjournal.com/article/2121?page=0,1)? Since you're now in thread i.s.o. signal context, you can do whatever you want. On newer linux you could also wait directly in the IO manager using signalfd(2).

Yes, we discussed doing this in the new I/O manager, but the sticking point is the bit where you say "just block all the signals". We can't guarantee to do this when new threads might be created by C libraries, or the Haskell code might itself be a library used by a C client. sigwaitinfo and signalfd are designed to be used by self-contained processes, not libraries.

comment:28 in reply to: ↑ 26 Changed 4 years ago by phunge0

Replying to duncan:

Replying to phunge0:

Suppose we call the normal waitpid() instead, and if we find a PID which isn't ours, we stash the status in a temp buffer, to be returned the next time someone else calls System.Posix.getAnyProcessStatus?

We must not reap processes that do not belong to us, it's not just the Haskell API, it's other C libs in the same process as us.

Yep, good point.

Replying to [comment 27: simonmar]:

Yes, we discussed doing this in the new I/O manager, but the sticking point is the bit where you say "just block all the signals". We can't guarantee to do this when new threads might be created by C libraries, or the Haskell code might itself be a library used by a C client. sigwaitinfo and signalfd are designed to be used by self-contained processes, not libraries.

Good point. In any of the "invisible" threads, the signal would not be blocked, and thus the signal handler would get invoked, which we wouldn't want. But, could we install a signal handler which noted the signal, and then blocked the signal from then on? All threads would eventually accumulate the correct signal masks, even the ones we don't know about (NB: sigprocmask is signal-safe, I'm assuming pthread_sigmask would be too).

This is gross, but so are signals ;) You'd still need to communicate over something like a pipe for the first time that a signal is invoked, before we had the chance to mask it off; but this'd be a small number of invocations, potentially signals_with_handlers * n_threads.

comment:29 Changed 4 years ago by simonmar

  • Milestone changed from 7.0.1 to _|_
  • Priority changed from high to normal

Dropping this to _|_ until I have a chance to work on it.

comment:30 Changed 4 years ago by dleuschner

  • Cc leuschner@… added

comment:31 Changed 3 years ago by igloo

  • Type changed from proposal to task

comment:32 Changed 3 years ago by altaic

  • Cc william.knop.nospam@… added

comment:33 Changed 2 years ago by lewurm

  • Cc lewurm@… added

comment:34 Changed 18 months ago by igloo

  • Milestone changed from _|_ to 7.8.1
  • Priority changed from normal to highest

Before releasing 7.8.1, we should check whether the exports in

commit 2ab2b4951e4525f34bc32876d40bc3f8fe9fe12d
Author: Ian Lynagh <ian@well-typed.com>
Date:   Wed Oct 31 15:12:28 2012 +0000

    Export CatchInfo,CatchInfoOnce constructors of Handler

are correct.

comment:35 Changed 18 months ago by simonmar

I think this is ok, but I'm surprised you didn't also have to export SignalInfo(..) and SignalSpecificInfo(..) (we'll definitely need these too, otherwise CatchInfo and CatchInfoOnce aren't very useful).

comment:36 Changed 9 months ago by simonpj

  • Owner changed from simonmar to igloo

Ian, Simon responded to your qn, and this is highest... can you finalise? Thanks

Simon

comment:37 Changed 9 months ago by igloo

  • Resolution set to fixed
  • Status changed from new to closed

Fixed by:

commit 46bfe3d56a2c7732bb6222f3e9ad6ad7a94e13d7

Author: Ian Lynagh <ian@well-typed.com>
Date:   Sun Jul 21 21:09:23 2013 +0100

    Export SignalInfo(..), SignalSpecificInfo(..); completes #2451
Note: See TracTickets for help on using tickets.