Re: [PATCH 2/2] NFSv4: Allow per-mount tuning of READDIR attrs

From: Benjamin Coddington <hidden>
Date: 2023-10-18 14:27:20

On 18 Oct 2023, at 9:33, Jeff Layton wrote:

On Wed, 2023-10-18 at 08:56 -0400, Chuck Lever wrote:

On Tue, Oct 17, 2023 at 05:30:44PM -0400, Benjamin Coddington wrote:

Expose a per-mount knob in sysfs to set the READDIR requested attributes
for a non-plus READDIR request.

For example:

  echo 0x800 0x800000 0x0 > /sys/fs/nfs/0\:57/v4_readdir_attrs

.. will revert the client to only request rdattr_error and
mounted_on_fileid for any non "plus" READDIR, as before the patch
preceding this one in this series.  This provides existing installations
an option to fix a potential performance regression that may occur after
NFS clients update to request additional default READDIR attributes.

Signed-off-by: Benjamin Coddington <redacted>
---
 fs/nfs/client.c           |  2 +
 fs/nfs/nfs4client.c       |  4 ++
 fs/nfs/nfs4proc.c         |  1 +
 fs/nfs/nfs4xdr.c          |  7 ++--
 fs/nfs/sysfs.c            | 81 +++++++++++++++++++++++++++++++++++++++
 include/linux/nfs_fs_sb.h |  1 +
 include/linux/nfs_xdr.h   |  1 +
 7 files changed, 93 insertions(+), 4 deletions(-)

Admittedly, it would be much easier for humans to use if the API was
based on the symbolic names of the bits rather than a triplet of raw
hexadecimal values.

This isn't aiming to be an ease-of-use interface.  This is tinkering with
the innards of the client.  If you're doing this, you better know how to
convert between bases, because you're going to need that and more.

If we want to make it nice, patches to nfsctl can follow.
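
For readers following the base conversion, here is a minimal sketch of how the
example triplet can be derived from NFSv4 attribute numbers. It assumes the
attribute numbering from RFC 7530 (rdattr_error = 11, mounted_on_fileid = 55)
and that each value written to the sysfs file is one 32-bit bitmap word, as the
example in the patch description suggests. The helper name attrs_to_bitmap is
hypothetical and is not part of the patch or of nfs-utils.

```shell
#!/bin/sh
# Hypothetical helper: turn NFSv4 attribute numbers into three 32-bit
# bitmap words. Attribute n lands in word n/32 at bit n%32.
attrs_to_bitmap() {
    w0=0 w1=0 w2=0
    for n in "$@"; do
        word=$((n / 32))
        bit=$((1 << (n % 32)))
        case $word in
            0) w0=$((w0 | bit)) ;;
            1) w1=$((w1 | bit)) ;;
            2) w2=$((w2 | bit)) ;;
            *) echo "attribute $n out of range" >&2; return 1 ;;
        esac
    done
    printf '0x%x 0x%x 0x%x\n' "$w0" "$w1" "$w2"
}

# rdattr_error is attribute 11, mounted_on_fileid is attribute 55.
attrs_to_bitmap 11 55    # prints: 0x800 0x800000 0x0
```

The output matches the triplet in the patch description, so under these
assumptions the result could be written to the per-mount v4_readdir_attrs file
as shown there.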

I think there are some significant footguns with this interface. It'd be
very easy to set this wrong and get weird behavior.  OTOH, we could push
that complexity into userland and provide some sort of script in
nfs-utils for tuning this.

That said...

When we look at interfaces like this, we have to consider that they may
be around for a long, long time (decades, even), and people will come to
rely on them to do strange things that are difficult for us to support.
If we have someone saying that their READDIR performance slowed down, we
now have to grab those settings from this sysfs file and validate them
when trying to help them.

Personally, I'd prefer a simple binary "make it work the old way"
switch, if we're concerned about performance regressions here. I think
that's the sort of thing that is simple to explain to admins that are
suffering from this problem and (more importantly) the sort of setting
we can later remove when it's no longer needed.

Adding this sort of fine-grained knob will create more problems than it
solves, as people will (inevitably) use it incorrectly.

I disagree that it will create more problems than it solves.

Also, sysfs isn't there for you to experiment with in production, and
sysadmins know this.  Sysfs is "_The_ filesystem for exporting kernel
objects".  There are plenty of ways to hose a system and corrupt data by
playing around with sysfs.

If we take the position that everything in NFS' sysfs must have a higher
standard of safety than even module parameters (see recover_lost_locks),
that means we better look at making every sysfs interface non-shoot-footy,
which is just insane.  Just take a look at a sampling of writeable files;
here are a couple:

/sys/block/sda/device/delete
/sys/kernel/sunrpc/xprt-switches/switch-1/xprt-0-local/dstaddr

Ben
