Thread: 83 messages, 18 authors, 2026-03-05

Re: [PATCH 00/24] vfs: require filesystems to explicitly opt-in to lease support

From: Chuck Lever <hidden>
Date: 2026-01-13 14:05:08
Also in: ceph-devel, gfs2, linux-btrfs, linux-cifs, linux-ext4, linux-f2fs-devel, linux-fsdevel, linux-mm, linux-nfs, linux-unionfs, linux-xfs, lkml, ntfs3, ocfs2-devel, v9fs

On 1/13/26 6:45 AM, Jeff Layton wrote:
On Tue, 2026-01-13 at 09:54 +0100, Christian Brauner wrote:
On Mon, Jan 12, 2026 at 09:50:20AM -0500, Jeff Layton wrote:
On Mon, 2026-01-12 at 09:31 -0500, Chuck Lever wrote:
On 1/12/26 8:34 AM, Jeff Layton wrote:
On Fri, 2026-01-09 at 19:52 +0100, Amir Goldstein wrote:
On Thu, Jan 8, 2026 at 7:57 PM Jeff Layton [off-list ref] wrote:
On Thu, 2026-01-08 at 18:40 +0100, Jan Kara wrote:
On Thu 08-01-26 12:12:55, Jeff Layton wrote:
Yesterday, I sent patches to fix how directory delegation support is
handled on filesystems where it should be disabled [1]. That set is
appropriate for v6.19. For v7.0, I want to make lease support be more
opt-in, rather than opt-out:

For historical reasons, when ->setlease() file_operation is set to NULL,
the default is to use the kernel-internal lease implementation. This
means that if you want to disable them, you need to explicitly set the
->setlease() file_operation to simple_nosetlease() or the equivalent.

This has caused a number of problems over the years as some filesystems
have inadvertently allowed leases to be acquired simply by having left
it set to NULL. It would be better if filesystems had to opt-in to lease
support, particularly with the advent of directory delegations.

This series sets the ->setlease() operation in a pile of existing
local filesystems to generic_setlease() and then changes
kernel_setlease() to return -EINVAL when the setlease() operation is not
set.

With this change, new filesystems will need to explicitly set the
->setlease() operations in order to provide lease and delegation
support.
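
The difference between the historical opt-out dispatch and the proposed
opt-in dispatch can be modelled in userspace. This is only a sketch: the
structs are stand-ins for the kernel types, the function signatures are
simplified (the real ->setlease() takes more arguments), and
setlease_opt_out() and setlease_opt_in() are hypothetical names for the
old and new behaviour of kernel_setlease().

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct file;

struct file_operations {
	/* Simplified: the real operation takes more arguments. */
	int (*setlease)(struct file *file, int arg);
};

struct file {
	const struct file_operations *f_op;
};

/* Stand-in for generic_setlease(): pretend the lease was granted. */
int generic_setlease(struct file *file, int arg)
{
	(void)file;
	(void)arg;
	return 0;
}

/* Stand-in for simple_nosetlease(): the filesystem refuses leases. */
int simple_nosetlease(struct file *file, int arg)
{
	(void)file;
	(void)arg;
	return -EINVAL;
}

/* Historical behaviour: a NULL ->setlease falls back to the generic code. */
int setlease_opt_out(struct file *file, int arg)
{
	if (file->f_op->setlease)
		return file->f_op->setlease(file, arg);
	return generic_setlease(file, arg);
}

/* Proposed behaviour: a NULL ->setlease means "no lease support". */
int setlease_opt_in(struct file *file, int arg)
{
	if (file->f_op->setlease)
		return file->f_op->setlease(file, arg);
	return -EINVAL;
}
```

In this model a filesystem that never set ->setlease gets leases under
setlease_opt_out() and -EINVAL under setlease_opt_in(); a filesystem
that sets it to generic_setlease() behaves the same under both.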

I mainly focused on filesystems that are NFS exportable, since NFS and
SMB are the main users of file leases, and they tend to end up exporting
the same filesystem types. Let me know if I've missed any.
So, what about kernfs and fuse? They seem to be exportable and don't have
.setlease set...
Yes, FUSE needs this too. I'll add a patch for that.

As far as kernfs goes: AIUI, that's basically what sysfs and resctrl
are built on. Do we really expect people to set leases there?

I guess it's technically a regression since you could set them on those
sorts of files earlier, but people don't usually export kernfs based
filesystems via NFS or SMB, and that seems like something that could be
used to make mischief.

AFAICT, kernfs_export_ops is mostly to support open_by_handle_at(). See
commit aa8188253474 ("kernfs: add exportfs operations").

One idea: we could add a wrapper around generic_setlease() for
filesystems like this that will do a WARN_ONCE() and then call
generic_setlease(). That would keep leases working on them but we might
get some reports that would tell us who's setting leases on these files
and why.
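
The wrapper idea above might look roughly like this. It is a userspace
model only: WARN_ONCE() is imitated with a static flag and fprintf(),
generic_setlease_model() stands in for generic_setlease() with a
simplified signature, and the wrapper name setlease_warn_once() and the
warn_count counter are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts emitted warnings so the one-shot behaviour can be observed. */
int warn_count;

/* Stand-in for generic_setlease(): pretend the lease was granted. */
int generic_setlease_model(int arg)
{
	(void)arg;
	return 0;
}

/*
 * Hypothetical wrapper: complain the first time a lease is requested,
 * then hand off to the generic implementation so leases keep working.
 */
int setlease_warn_once(const char *fsname, int arg)
{
	static bool warned;

	if (!warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "lease requested on %s file\n", fsname);
	}
	return generic_setlease_model(arg);
}
```

Every call still succeeds, but only the first one produces a report.
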
IMO, you are being too cautious, but whatever.

It is not accurate that kernfs filesystems are NFS exportable in general.
Only cgroupfs has KERNFS_ROOT_SUPPORT_EXPORTOP.

If any application is using leases on cgroup files, it must be some
very advanced runtime (e.g. systemd), so we should know about the
regression sooner rather than later.
I think so too. For now, I think I'll not bother with the WARN_ONCE().
Let's just leave kernfs out of the set until someone presents a real
use-case.
There are also the recently added nsfs and pidfs export_operations.

I have a recollection about wanting to be explicit about not allowing
those to be exportable to NFS (nsfs specifically), but I can't see where
and if that restriction was done.

Christian? Do you remember?
(cc'ing Chuck)

FWIW, you can currently export and mount /sys/fs/cgroup via NFS. The
directory doesn't show up when you try to get to it via NFSv4, but you
can mount it using v3 and READDIR works. The files are all empty when
you try to read them. I didn't try to do any writes.

Should we add a mechanism to prevent exporting these sorts of
filesystems?

Even better would be to make nfsd exporting explicitly opt-in. What if
we were to add an EXPORT_OP_NFSD flag that explicitly allows filesystems
to opt-in to NFS exporting, and check for that in __fh_verify()? We'd
have to add it to a bunch of existing filesystems, but that's fairly
simple to do with an LLM.
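
A userspace sketch of such a flag check follows. The structs are reduced
stand-ins for the kernel types, the EXPORT_OP_NFSD value is arbitrary,
and nfsd_export_allowed() is a hypothetical helper standing in for
whatever check __fh_verify() would perform.

```c
#include <stdbool.h>
#include <stddef.h>

/* Proposed flag from the discussion; the bit value here is arbitrary. */
#define EXPORT_OP_NFSD 0x1UL

/* Reduced stand-ins for the kernel structures. */
struct export_operations {
	unsigned long flags;
};

struct super_block {
	const struct export_operations *s_export_op;
};

/*
 * Hypothetical helper: having export operations is no longer enough,
 * the filesystem must also have opted in to NFS export.
 */
bool nfsd_export_allowed(const struct super_block *sb)
{
	const struct export_operations *ops = sb->s_export_op;

	if (!ops)
		return false;
	return (ops->flags & EXPORT_OP_NFSD) != 0;
}
```

Under this model, a filesystem that defines export operations only for
open_by_handle_at() leaves the flag clear and is rejected by nfsd.
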
What's the active harm in exporting /sys/fs/cgroup ? It has to be done
explicitly via /etc/exports, so this is under the NFS server admin's
control. Is it an attack surface?
Potentially?

I don't see any active harm with exporting cgroupfs. It doesn't work
right via nfsd, but it's not crashing the box or anything.

At one time, those were only defined by filesystems that wanted to
allow NFS export. Now we've grown them on filesystems that just want to
provide filehandles for open_by_handle_at() and the like. nfsd doesn't
care though: if the fs has export operations, it'll happily use them.

Having an explicit "I want to allow nfsd" flag seems like it might
save us some headaches in the future when other filesystems add export
ops for this sort of filehandle use.
So we are re-hashing a discussion we had a few months ago (Amir was
involved at least).
Yep, I was lurking on it, but didn't have a lot of input at the time.
I don't think we want to expose cgroupfs via NFS; that's super weird.
It's like remote partial resource management and it would be very
strange if a remote process suddenly would be able to move things around
in the cgroup tree. So I would prefer to not do this.

So my preference would be to really sever file handles from the export
mechanism so that we can allow stuff like pidfs and nsfs and cgroupfs to
use file handles via name_to_handle_at() and open_by_handle_at() without
making them exportable.
Agreed. I think we want to make NFS export be a deliberate opt-in
decision that filesystem developers make.
No objection, what about ksmbd, AFS, or Ceph?

How we do that is up for debate, of course.

An export ops flag would be fairly simple to implement, but it sounds
like you're thinking that we should split some export_operations into
struct file_handle_operations and then add a pointer for that to
super_block (and maybe to export_operations too)?
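
One possible reading of that split, modelled in userspace. Nothing here
is an agreed design: the member names s_fh_op and fh_ops, the simplified
encode_fh signature, and both helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: handle encoding/decoding, usable without NFS export. */
struct file_handle_operations {
	/* Simplified placeholder for the real encode_fh signature. */
	int (*encode_fh)(unsigned int *fh, int *max_len);
};

/* Export operations would then only describe NFS-style exporting. */
struct export_operations {
	const struct file_handle_operations *fh_ops; /* the "maybe" pointer */
	unsigned long flags;
};

/* Reduced stand-in for the kernel super_block. */
struct super_block {
	const struct file_handle_operations *s_fh_op; /* hypothetical member */
	const struct export_operations *s_export_op;
};

/* name_to_handle_at()/open_by_handle_at() would key off s_fh_op only. */
bool sb_supports_file_handles(const struct super_block *sb)
{
	return sb->s_fh_op != NULL;
}

/* nfsd would key off s_export_op only. */
bool sb_supports_nfs_export(const struct super_block *sb)
{
	return sb->s_export_op != NULL;
}
```

With this layout a pidfs-like filesystem could set only s_fh_op and
would offer file handles without ever becoming exportable.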

This would be a good LSF/MM topic, but I'm hoping we can come to a
consensus before then.

-- 
Chuck Lever