Thread (3 messages) 3 messages, 3 authors, 2025-06-26

Re: [PATCH v3 3/4] daemon: use sigaction() to install child_handler()

From: Carlo Marcelo Arenas Belón <hidden>
Date: 2025-06-26 16:36:04

On Thu, Jun 26, 2025 at 08:33:29AM -0800, Junio C Hamano wrote:
Phillip Wood [off-list ref] writes:
quoted
Only because you have chosen to use SA_RESTART here. I think it would
be better to drop that and instead say something like


POSIX leaves several aspects of the behavior of signal() up to the
implementer. It is unspecified whether long running syscalls are
restarted after an interrupt is received.  The event loop in "git
daemon" assumes that poll() will return with EINTR if a child exits
while it is polling but if poll() is restarted after an interrupt is
received that does not happen which can lead to a build up of
uncollected zombie processes.

It is also unspecified whether the handler is reset after an interrupt
is received. In order to work correctly on operating systems such as
Solaris where the handler for SIGCLD is reset when a signal is
received "git daemon" calls signal() from the SIGCLD handler to
re-establish a handler for that signal. Unfortunately this causes
infinite recursion on other operating systems such as AIX.

Both of these problems can be addressed by using sigaction() instead
of signal() which has much stricter semantics and so, by setting the
appropriate flags, we can rely on poll() being interrupted and the
SIGCLD handler not being reset. This change means that all long
running system calls could fail with EINTR not just pall() but rest of
the event loop code is designed to gracefully handle that.

After this change there is still a race where a child that exits after
it has been checked in check_dead_children() but before we call poll()
will not be collected until a new connection is received or a child
exits while we're polling. We could fix this by using the "self-pipe
trick" but do not because ....


Then we can drop patches 1 and 4.
Thanks for a suggestion.  I really like the "everybody in these code
paths are prepared to receive and handle EINTR just fine, so we do
not have to do the SA_RESTART" observation very much.
Except that is unlikely to be true, as the code has changed a lot on
those 20 years (including that it now uses run_command) and that we
had been effectively using SA_RESTART under the covers for most of
that time because signal() behaviour changed. An obvious bug:
(ex: 20250626161038.85966-1-carenas@gmail.com)

I think using SA_RESTART by default might be safer, specially considering
that patch 4 already exist and it is not that complicated now that we
have 2.

Carlo
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help