Thread (49 messages) 49 messages, 14 authors, 2026-02-06

Re: [RFC v1] man/man2/close.2: CAVEATS: Document divergence from POSIX.1-2024

From: Rich Felker <dalias@libc.org>
Date: 2026-01-20 17:47:01
Also in: linux-fsdevel

On Tue, Jan 20, 2026 at 12:05:52PM -0500, Zack Weinberg wrote:
quoted
On Fri, May 23, 2025 at 02:10:57PM -0400, Zack Weinberg wrote:
quoted
    close() always succeeds.  That is, after it returns, _fd_ has
    always been disconnected from the open file it formerly referred
    to, and its number can be recycled to refer to some other file.
    Furthermore, if _fd_ was the last reference to the underlying
    open file description, the resources associated with the open file
    description will always have been scheduled to be released.
...
quoted
quoted
    EINPROGRESS
    EINTR
           There are no delayed errors to report, but the kernel is
           still doing some clean-up work in the background.  This
           situation should be treated the same as if close() had
           returned zero.  Do not retry the close(), and do not report
           an error to the user.
Since this behavior for EINTR is non-conforming (and even prior to the
POSIX 2024 update, it was contrary to the general semantics for EINTR,
that no non-ignoreable side-effects have taken place), it should be
noted that it's Linux/glibc-specific.
I am prepared to take your word for it that POSIX says this is
non-conforming, but in that case, POSIX is wrong, and I will not be
convinced otherwise by any argument.  Operations that release a
resource must always succeed.
There are two conflicting requirements here:

1. Operations that release a resource must always succeed.
2. Failure with EINTR must not not have side effects.

The right conclusion is that operations that release resources must
not be able to fail with EINTR. And that's how POSIX should have
resolved the situation -- by getting rid of support for the silly
legacy synchronous-tape-drive-rewinding behavior of close on some
systems, and requiring close to succeed immediately with no waiting
for anything. But abandoning requirement 2 is not an option,
especially in light of the relationship between EINTR and thread
cancellation in regards to contract about side effects.

It's perfectly reasonable for implementations (as musl does, and as I
think glibc either does or intends to do) to just go all the way and
satisfy both 1 and 2 by having close translate the kernel EINTR into
0.
Now, the abstract correct behavior is secondary to the fact that we
know there are both systems where close should not be retried after
EINTR (Linux) and systems where the fd is still open after EINTR
(HP-UX).  But it is my position that *portable code* should assume the
Linux behavior, because that is the safest option.  If you assume the
HP-UX behavior on a machine that implements the Linux behavior, you
might close some unrelated file out from under yourself (probably but
not necessarily a different thread).  If you assume the Linux behavior
on a machine that implements the HP-UX behavior, you have leaked a
file descriptor; the worst things that can do are much less severe.
Unfortunately, regardless of what happens, code portable to old
systems needs to avoid getting in the situation to begin with. By
either not installing interrupting signal handlers or blocking EINTR
around close.
The only way to get it right all the time is to have a big long list
of #ifdefs for every Unix under the sun, and we don't even have the
data we would need to write that list.
quoted
While I agree with all of this, I think the tone is way too
proscriptive. The man pages are to document the behaviors, not tell
people how to program.
I could be persuaded to tone it down a little but in this case I think
the man page's job *is* to tell people how to program.  We know lots of
existing code has gotten the fine details of close() wrong and we are
trying to document how to do it right.
No, the job of the man pages absolutely is not "to tell people how to
program". It's to document behaviors. They are not a programming
tutorial. They are not polemic diatribes. They are unbiased statements
of facts. Facts of what the standards say and what implementations do,
that equip programmers with the knowledge they need to make their own
informed decisions, rather than blindly following what someone who
thinks they know better told them to do.
quoted
Aside: the reason EINTR *has to* be specified this way is that pthread
cancellation is aligned with EINTR. If EINTR were defined to have
closed the fd, then acting on cancellation during close would also
have closed the fd, but the cancellation handler would have no way to
distinguish this, leading to a situation where you're forced to either
leak fds or introduce a double-close vuln.
The correct way to address this would be to make close() not be a
cancellation point.
This would also be a desirable change, one I would support if other
implementors are on-board with pushing for it.
quoted
An outline of what I'd like to see instead:

- Clear explanation of why double-close is a serious bug that must
  always be avoided. (I think we all agree on this.)

- Statement that the historical Linux/glibc behavior and current POSIX
  requirement differ, without language that tries to paint the POSIX
  behavior as a HP-UX bug/quirk. Possibly citing real sources/history
  of the issue (Austin Group tracker items 529, 614; maybe others).

- Consequence of just assuming the Linux behavior (fd leaks on
  conforming systems).

- Consequences of assuming the POSIX behavior (double-close vulns on
  GNU/Linux, maybe others).

- Survey of methods for avoiding the problem (ways to preclude EINTR,
  possibly ways to infer behavior, etc).
This outline seems more or less reasonable to me but, if it's me
writing the text, I _will_ characterize what POSIX currently says
about EINTR returns from close() as a bug in POSIX.  As far as I'm
concerned, that is a fact, not polemic.

I have found that arguing with you in particular, Rich, is generally
not worth the effort.  Therefore, unless you reply and _accept_ that
the final version of the close manpage will say that POSIX is buggy,
I am not going to write another version of this text, nor will I be
drawn into further debate.
I will not accept that because it's a gross violation of the
responsibility of document writing.

Rich
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help