Thread (25 messages) 25 messages, 7 authors, 2019-03-30

Re: [PATCH 2/4] pid: add pidfd_open()

From: Daniel Colascione <hidden>
Date: 2019-03-30 07:39:20
Also in: lkml

On Fri, Mar 29, 2019 at 11:25 PM Jonathan Kowalski [off-list ref] wrote:
On Sat, Mar 30, 2019 at 5:35 AM Daniel Colascione [off-list ref] wrote:
quoted
On Thu, Mar 28, 2019 at 3:38 AM Christian Brauner [off-list ref] wrote:
quoted
quoted
All that said, thanks for the work on this once again. My intention is
just that we don't end up with an API that could have been done better
and be cleaner to use for potential users in the coming years.
Thanks for your input on all of this. I still don't find multiplexers in
the style of seccomp()/fsconfig()/keyctl() to be a problem since they
deal with a specific task. They are very much different from ioctl()s in
that regard. But since Joel, you, and Daniel found the pidctl() approach
not very nice I dropped it. The interface needs to be satisfactory for
all of us especially since Android and other system managers will be the
main consumers.
Thanks.
quoted
So let's split this into pidfd_open(pid_t pid, unsigned int flags) which
allows to cleanly get pidfds independent procfs and do the translation
to procpidfds in an ioctl() as we've discussed in prior threads. This
I sustain my objection to adding an ioctl. Compared to a system call,
an ioctl has a more rigid interface, greater susceptibility to
programmer error (due to the same ioctl control code potentially doing
different things for different file types), longer path length, and
more awkward filtering/monitoring/auditing/tracing. We've discussed
this issue at length before, and I thought we all agreed to use system
calls, not ioctl, for core kernel functionality. So why is an ioctl
suddenly back on the table? The way I see it, an ioctl has no
advantages except for 1) conserving system call numbers, which are not
scarce, and 2) avoiding the system call number coordination problem
(and the coordination problem isn't a factor for core kernel code). I
don't understand everyone's reluctance to add new system calls. What
am I missing? Why would we give up all the advantages that a system
call gives us?
I agree in general, but in this particular case a system call or an
ioctl doesn't matter much, all it does is take the pidfd, the command,
and /proc's dir fd.
Thanks again.

I agree that the operation we're discussing has a simple signature,
but signature flexibility isn't the only reason to prefer a system
call over an ioctl. There are other reasons for preferring system
calls to ioctls (safety, tracing, etc.) that apply even if the
operation we're discussing has a relatively simple signature: for
example, every system call has a distinct and convenient ftrace event,
but ioctls don't; strace filtering Just Works on a
system-call-by-system-call basis, but it doesn't for ioctls; and
documentation for system calls is much more discoverable (e.g., man
-k) than documentation for ioctls. Even if the distinction doesn't
matter much, IMHO, it still matters a little, enough to favor a system
call without an offsetting advantage for the ioctl option.
If you start adding a system call for every specific operation on file
descriptors, it *will* become a problem.
I'm not sure what you mean. Do you mean that adding a top-level system
call for every operation that might apply to one specific kind of file
descriptor would lead, as the overall result, to the kernel having
enough system calls to cause negative consequences? I'm not sure I
agree, but accepting this idea for the sake of discussion: shouldn't
we be more okay with system calls for features present on almost all
systems --- like procfs --- even if we punt to ioctls very rarely-used
functionality, e.g., some hypothetical special squeak noise that you
could get some specific 1995 adlib clone to make?
Besides, the translation is
just there because it is racy to do in userspace,  it is not some well
defined core kernel functionality.
Therefore, it is just a way to
enter the kernel to do the openat in a race free and safe manner.
I agree that the translation has to be done in the kernel, not
userspace, and that the kernel must provide to userspace some
interface for requesting that the translation happen: we're just
discussing the shape of this interface. Shouldn't all interfaces
provided by the kernel to userspace be equally well defined? I'm not
sure that the internal simplicity of the operation matters much
either. There are already explicit system calls for some
simple-to-implement things, e.g., timerfd_gettime. It's worth noting
that timerfd is (IIRC), like procfs, a feature that's both ubiquitous
and optional.
As is, the facility being provided through an ioctl on the pidfd is
not something I'd consider a problem.
You're right that from a signature perspective, using an ioctl isn't a
problem. I just want to make sure we take into account the other,
non-signature advantages that system calls have over ioctls.
I think the translation stuff
should also probably be an extension of ioctl_ns(2) (but I wouldn't be
opposed if translate_pid is resurrected as is).
For anything more involved than ioctl(pidfd, PIDFD_TO_PROCFD,
procrootfd), I'd agree that a system call would be a cleaner
interface, otherwise, if you cannot generalise it, using ioctls as a
command interface is probably the better tradeoff here.
Sure: I just want to better understand everyone else's thought process
here, having been frustrated with things like the termios ioctls being
ioctls.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help