Thread: 12 messages, 4 authors, 2025-06-11

Re: [nfsv4] Re: simple NFSv4.1/4.2 test of remove while holding a delegation

From: Jeff Layton <jlayton@kernel.org>
Date: 2025-06-10 11:51:10

On Mon, 2025-06-09 at 18:06 -0700, Rick Macklem wrote:
On Mon, Jun 9, 2025 at 5:17 PM Dai Ngo [off-list ref] wrote:
On 6/9/25 4:35 PM, Rick Macklem wrote:
Hi,

I hope you don't mind a cross-post, but I thought both groups
might find this interesting...

I have been creating a compound RPC that does REMOVE and
then tries to determine if the file object has been removed and
I was surprised to see quite different results from the Linux knfsd
and Solaris 11.4 NFSv4.1/4.2 servers. I think both these servers
provide FH4_PERSISTENT file handles, although I suppose I
should check that?

First, the test OPEN/CREATEs a regular file called "foo" (only one
hard link) and acquires a write delegation for it.
Then a compound does the following:
...
REMOVE foo
PUTFH fh for foo
GETATTR

For the Solaris 11.4 server, the server CB_RECALLs the
delegation and then replies NFS4ERR_STALE for the PUTFH above.
(The FreeBSD server currently does the same.)

For a fairly recent Linux (6.12) knfsd, the above replies NFS_OK
with nlinks == 0 in the GETATTR reply.

Hmm. So I've looked in RFC8881 (I'm terrible at reading it so I
probably missed something) and I cannot find anything that states
either of the above behaviours is incorrect.
This seems outside the scope of the spec. What you're probably seeing
is just differences in the implementation details of the two servers.
(NFS4ERR_STALE is listed as an error code for PUTFH, but the
description of PUTFH only says that it sets the CFH to the fh arg.
It does not say anything w.r.t. the fh arg. needing to be for a file
that still exists.) Neither of these servers sets
OPEN4_RESULT_PRESERVE_UNLINKED in the OPEN reply.

So, it looks like "file object no longer exists" is indicated either
by a NFS4ERR_STALE reply to either PUTFH or GETATTR
OR
by a successful reply, but with nlinks == 0 for the GETATTR reply.
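
A client could fold the two observed outcomes into one check. The following is a minimal sketch, not code from any existing client: the function name and its arguments are hypothetical, and it assumes the caller has already decoded the PUTFH and GETATTR results from the compound reply. NFS4ERR_STALE is 70 in the NFSv4 error code list.

```python
NFS4_OK = 0
NFS4ERR_STALE = 70  # value from the NFSv4 error code list


def file_object_removed(putfh_status, getattr_status=None, numlinks=None):
    """Return True if the replies say the file object is gone or fully unlinked.

    Hypothetical helper: covers both behaviours described above
    (NFS4ERR_STALE from PUTFH or GETATTR, or NFS4_OK with a link count of 0).
    """
    if putfh_status == NFS4ERR_STALE:
        return True
    if putfh_status != NFS4_OK:
        raise ValueError("unexpected PUTFH status: %d" % putfh_status)
    if getattr_status == NFS4ERR_STALE:
        return True
    if getattr_status != NFS4_OK:
        raise ValueError("unexpected GETATTR status: %r" % (getattr_status,))
    if numlinks is None:
        raise ValueError("GETATTR reply did not include a link count")
    return numlinks == 0
```

Any other error status is raised rather than guessed at, since the thread only discusses these two outcomes.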

To be honest, I kinda like the Linux knfsd version, but I am wondering
if others think that both of these replies are correct?

Also, is the CB_RECALL needed when the delegation is held by
the same client as the one doing the REMOVE?
The Linux NFSD detects the delegation belongs to the same client that
causes the conflict (due to REMOVE) and skips the CB_RECALL. This is
an optimization based on the assumption that the client would handle
the conflict locally.
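
As a rough illustration of that decision (a toy model only, not the actual knfsd code; the function name, the client identifiers and the returned strings are all made up for this sketch):

```python
def remove_conflict_action(delegation_holder, requester):
    """Decide what a server does when REMOVE hits a file with a delegation.

    delegation_holder: client id holding the delegation, or None if there is none.
    requester: client id that sent the REMOVE.
    """
    if delegation_holder is None:
        return "proceed"            # no delegation, nothing conflicts
    if delegation_holder == requester:
        return "proceed"            # same client: skip the CB_RECALL
    return "recall_and_wait"        # other client: CB_RECALL, REMOVE waits


# Example: client "A" holds the delegation.
print(remove_conflict_action("A", "A"))  # same client
print(remove_conflict_action("A", "B"))  # different client
```

The "recall_and_wait" branch corresponds to the other-client case described elsewhere in this thread, where the REMOVE does not complete until the delegation is returned.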
And then what does the server do with the delegation?
- Does it just discard it, since the file object has been deleted?
OR
- Does it guarantee that a DELEGRETURN done after the REMOVE will
  still work (which seems to be the case for the 6.12 server I am using for
  testing).
The latter. The file on the server is still being held open by virtue
of the fact that the client holds a delegation stateid on it.

The inode will still exist in core (with nlinks == 0) until its last
reference is released (here, when the client does the final
DELEGRETURN). Aside from the fact that it's now disconnected from the
filesystem namespace, it's still "alive", and reachable via filehandle.
If the REMOVE was done by another client, the REMOVE will not complete
until the delegation is returned. If the PUTFH comes after the REMOVE
was completed, it'll fail with NFS4ERR_STALE since the file, specified
by the file handle, no longer exists.
Assuming the statement w.r.t. "fail with NFS4ERR_STALE" only applies to
"REMOVE done by another client" then that sounds fine.
However, if "fail with NFS4ERR_STALE" is also supposed to happen after a
REMOVE by the same client, then that is not what I am seeing.
If you are curious, the packet trace is here. (Look at packet#58).
https://people.freebsd.org/~rmacklem/linux-remove.pcap

Btw, in case you are curious why I am doing this testing, I am trying
to figure out a good way for the FreeBSD client to handle temporary
files. Typically on POSIX they are done via the syscalls:

fd = open("foo", O_CREAT ...);
unlink("foo");
write(fd,..), write(fd,..)...
read(fd,...), read(fd,...)...
close(fd);

If this happens quickly and is not too much writing, the writes
copy data into buffers/pages, the reads read the data out of
the pages and then it all gets deleted.
Yep, common pattern.
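
For reference, the local (non-NFS) form of this pattern can be sketched as below, using Python's os module as a thin wrapper over the same syscalls. This is only an illustration under stated assumptions: the function name is made up, the file lives in a scratch directory from tempfile, and the link count of 0 after unlink is what a typical local Linux filesystem reports (other filesystems may differ).

```python
import os
import tempfile


def unlinked_tempfile_roundtrip(payload: bytes):
    """Create "foo", unlink it, then write and read through the open fd.

    Returns (nlink_after_unlink, data_read_back, name_still_exists).
    """
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "foo")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.unlink(path)                  # the name leaves the namespace
        nlink = os.fstat(fd).st_nlink    # the inode is still reachable via fd
        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))
        return nlink, data, os.path.exists(path)
    finally:
        os.close(fd)                     # last reference: the file is freed
        os.rmdir(workdir)
```

The data stays readable after the unlink because the open descriptor keeps the file alive, which is the behaviour the temporary-file idiom relies on.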
Unfortunately, the CB_RECALL forces the NFSv4.n client
to do WRITE, WRITE,..COMMIT and then DELEGRETURN.
Then the REMOVE throws all the data away on the NFSv4.n
server.
--> As such, I really like not doing the CB_RECALL for "same client".
My concern is: what happens to the delegation after the file object ("foo")
gets deleted?
It either needs to be thrown away by the NFSv4.n server or the
PUTFH, DELEGRETURN needs to work after the REMOVE.
I think the latter. A REMOVE just removes the filename from the
namespace. What happens to the underlying inode/vnode/whathaveyou is
undefined by the protocol. The delegation is effectively holding the
file open, so it needs to continue to exist on the server, just as the
file "foo" in your example above must exist after the unlink().
Otherwise, the NFSv4.n server may get constipated by the delegations,
which might be called stale, since the file object has been deleted.

--> I can do PUTFH, GETATTR after REMOVE in the same compound,
     to find out if the file object has been deleted. But then, if a
     PUTFH, DELEGRETURN fails with NFS4ERR_STALE, can I get
     away with saying "the server should just discard the delegation as
     the client already has done so"?

Thanks for your comments, rick
If you still have an outstanding delegation after a REMOVE, then
returning ESTALE on the filehandle at that point seems wrong. The
delegation still exists, so the underlying filehandle should still
exist.

Linux doesn't generally throw back an NFS4ERR_STALE until it just can't
find the inode at all anymore. A dentry holds a reference to the inode,
and open files hold a reference to the dentry. The remove just
disconnects the dentry from the namespace and drops its refcount. When
the DELEGRETURN issues the last close, the inode gets cleaned up and at
that point you can't find it by filehandle anymore.
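
The lifetime rule described above can be modelled in a few lines. This is a deliberately simplified toy, not how knfsd is implemented: the class and method names are invented, the dentry and inode references are collapsed into a single "refs" counter, and the delegation is treated as the only thing holding the file open.

```python
NFS4_OK = 0
NFS4ERR_STALE = 70


class Inode:
    def __init__(self):
        self.nlink = 1   # one name in the namespace
        self.refs = 0    # holders keeping the file open (e.g. a delegation)


class ToyServer:
    def __init__(self):
        self.names = {}    # name -> filehandle
        self.inodes = {}   # filehandle -> Inode
        self._next_fh = 1

    def create_with_delegation(self, name):
        fh = self._next_fh
        self._next_fh += 1
        inode = Inode()
        inode.refs = 1     # the delegation holds the file open
        self.names[name] = fh
        self.inodes[fh] = inode
        return fh

    def remove(self, name):
        fh = self.names.pop(name)      # disconnect from the namespace
        self.inodes[fh].nlink -= 1
        self._maybe_free(fh)

    def getattr_nlink(self, fh):
        inode = self.inodes.get(fh)
        if inode is None:
            return NFS4ERR_STALE, None
        return NFS4_OK, inode.nlink

    def delegreturn(self, fh):
        inode = self.inodes.get(fh)
        if inode is None:
            return NFS4ERR_STALE
        inode.refs -= 1                # the "last close"
        self._maybe_free(fh)
        return NFS4_OK

    def _maybe_free(self, fh):
        inode = self.inodes[fh]
        if inode.nlink == 0 and inode.refs == 0:
            del self.inodes[fh]        # no longer findable by filehandle
```

In this model, a GETATTR by filehandle after REMOVE succeeds with a link count of 0 while the delegation is outstanding, and the filehandle only goes stale once DELEGRETURN has dropped the last reference.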

You probably want to aim for similar behavior in FreeBSD?
-Dai
(I don't think it is, but there is a discussion in 18.25.4 which says
"When the determination above cannot be made definitively because
delegations are being held, they MUST be recalled.." but everything
above that is a may/MAY, so it is not obvious to me if a server really
needs to care?)

Any comments? Thanks, rick
ps: I am amazed when I learn these things about NFSv4.n after all
       these years.

-- 
Jeff Layton [off-list ref]