Re: [PATCH 00/13] VFS: Filesystem information [ver #19]
From: Miklos Szeredi <miklos@szeredi.hu>
Date: 2020-03-18 16:06:07
Also in:
linux-api, linux-ext4, linux-fsdevel, linux-nfs, lkml
On Wed, Mar 18, 2020 at 4:08 PM David Howells [off-list ref] wrote:
============================
WHY NOT USE PROCFS OR SYSFS?
============================
Why is it better to go with a new system call rather than adding more magic
stuff to /proc or /sysfs for each superblock object and each mount object?
(1) It can be targetted. It makes it easy to query directly by path.
procfs and sysfs cannot do this easily.
(2) It's more efficient as we can return specific binary data rather than
making huge text dumps. Granted, sysfs and procfs could present the
same data, though as lots of little files which have to be
individually opened, read, closed and parsed.

Asked this a number of times, but you haven't answered yet: what
application would require such a high efficiency?

Nobody's suggesting we move stat(2) to proc interfaces, and AFAIK
nobody suggested we move /proc/PID/* to a binary syscall interface.
Each one has its place, and I strongly feel that mount info belongs in
the latter category. Feel free to prove the opposite.

(3) We wouldn't have the overhead of open and close (even adding a
self-contained readfile() syscall has to do that internally

Busted: add f_op->readfile() and be done with all that. For example
DEFINE_SHOW_ATTRIBUTE() could be trivially moved to that interface.

We could optimize existing proc, sys, etc. interfaces, but it's not
been an issue, apparently.

(4) Opening a file in procfs or sysfs has a pathwalk overhead for each
file accessed. We can use an integer attribute ID instead (yes, this
is similar to ioctl) - but could also use a string ID if that is
preferred.
(5) Can easily query cross-namespace if, say, a container manager process
is given an fs_context that hasn't yet been mounted into a namespace -
or hasn't even been fully created yet.

Works with my patch.

(6) Don't have to create/delete a bunch of sysfs/procfs nodes each time a
mount happens or is removed - and since systemd makes much use of
mount namespaces and mount propagation, this will create a lot of
nodes.

Not true.

The argument for doing this through procfs/sysfs/somemagicfs is that someone using a shell can just query the magic files using ordinary text tools, such as cat - and that has merit - but it doesn't solve the query-by-pathname problem. The suggested way around the query-by-pathname problem is to open the target file O_PATH and then look in a magic directory under procfs corresponding to the fd number to see a set of attribute files[*] laid out. Bash, however, can't open by O_PATH or O_NOFOLLOW as things stand...

Bash doesn't have fsinfo(2) either, so that's not really a good argument.

Implementing a utility to show mount attribute(s) by path is trivial
for the file based interface, while it would need to be updated for
each extension of fsinfo(2). Same goes for libc, language bindings,
etc.

Thanks,
Miklos
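
For illustration only, here is a minimal sketch of the open-by-O_PATH-then-look-under-procfs pattern discussed above. It does not use the attribute files proposed in this thread. As an assumption, it substitutes interfaces that already exist in mainline: the mnt_id field in /proc/self/fdinfo/<fd> and the per-mount lines in /proc/self/mountinfo. The function names mount_id_for_path() and mount_attributes() are hypothetical.

```python
import os


def mount_id_for_path(path):
    """Return the ID of the mount that `path` lives on."""
    # O_PATH yields a handle on the location without opening it for I/O;
    # O_NOFOLLOW makes a trailing symlink itself the target.
    fd = os.open(path, os.O_PATH | os.O_NOFOLLOW)
    try:
        # Existing interface (not the one proposed in the thread):
        # fdinfo carries a "mnt_id:" line for every open descriptor.
        with open(f"/proc/self/fdinfo/{fd}") as info:
            for line in info:
                key, _, value = line.partition(":")
                if key == "mnt_id":
                    return int(value)
    finally:
        os.close(fd)
    raise RuntimeError("no mnt_id field in fdinfo")


def mount_attributes(path):
    """Look up the mountinfo entry for the mount containing `path`.

    Returns a dict, or None if the mount is not listed in this
    process's mount namespace.
    """
    mnt_id = mount_id_for_path(path)
    with open("/proc/self/mountinfo") as table:
        for line in table:
            # Fields before " - " are per-mount, fields after it are
            # per-superblock (see proc(5)).
            pre, _, post = line.rstrip("\n").partition(" - ")
            fields = pre.split()
            if not fields or int(fields[0]) != mnt_id:
                continue
            sb = post.split()
            return {
                "mount_id": mnt_id,
                "mount_point": fields[4],
                "mount_options": fields[5],
                "fstype": sb[0] if sb else None,
            }
    return None


if __name__ == "__main__":
    print(mount_attributes("/"))
```

Because the result is parsed from text, a new field in the kernel's output needs no change to the lookup itself, which is the property argued for above. The cost is the open/read/close and parsing overhead that the quoted points (2) and (3) object to.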