Thread (8 messages) 8 messages, 3 authors, 2023-01-07

Re: [PATCH] nfsd: fix potential race in nfs4_find_file

From: Chuck Lever III <hidden>
Date: 2023-01-05 21:37:04

On Jan 5, 2023, at 3:43 PM, Jeff Layton [off-list ref] wrote:

On Thu, 2023-01-05 at 14:46 +0000, Chuck Lever III wrote:
quoted
quoted
On Jan 5, 2023, at 7:18 AM, Jeff Layton [off-list ref] wrote:

Even though there is a WARN_ON_ONCE check, it seems possible for
nfs4_find_file to race with the destruction of an fi_deleg_file while
trying to take a reference to it.

put_deleg_file is done while holding the fi_lock. Take and hold it
when dealing with the fi_deleg_file in nfs4_find_file.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4state.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b68238024e49..3df3ae84bd07 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6417,23 +6417,27 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
static struct nfsd_file *
nfs4_find_file(struct nfs4_stid *s, int flags)
{
+	struct nfsd_file *ret = NULL;
+
	if (!s)
		return NULL;

	switch (s->sc_type) {
	case NFS4_DELEG_STID:
-		if (WARN_ON_ONCE(!s->sc_file->fi_deleg_file))
-			return NULL;
-		return nfsd_file_get(s->sc_file->fi_deleg_file);
+		spin_lock(&s->sc_file->fi_lock);
+		if (!WARN_ON_ONCE(!s->sc_file->fi_deleg_file))
You'd think this would be a really really hard race to hit.

What I'm wondering, though, is whether the WARN_ON_ONCE should
be dropped by this patch. I've never seen it fire.
I have:

   https://bugzilla.redhat.com/show_bug.cgi?id=1997177
It's possible though that those WARNs are fallout from other bugs in the
delegation handling, but it's hard to know for sure.
Before 2015 there were a bunch of BUG_ON's in this code that
were converted to WARN after Linus complained. Before that,
I think these were all debugging sentinels. (in which case
I would argue they might be better recast as tracepoints,
but that's for another day).

I think we ought to keep it there for now.
The question is whether the WARN_ON is adding value for customers.
Can they do something about it? If they give us this information,
can we do something about it?

I can't tell from the warning whether the problem is due to a
server bug or valid client behavior. Both the server and the
client workload appear to survive.

So, I just don't feel like it's adding value, and firing a WARN
while holding a spinlock makes me squidgy.

quoted
quoted
+			ret = nfsd_file_get(s->sc_file->fi_deleg_file);
+		spin_unlock(&s->sc_file->fi_lock);
+		break;
	case NFS4_OPEN_STID:
	case NFS4_LOCK_STID:
		if (flags & RD_STATE)
-			return find_readable_file(s->sc_file);
+			ret = find_readable_file(s->sc_file);
		else
-			return find_writeable_file(s->sc_file);
+			ret = find_writeable_file(s->sc_file);
	}

-	return NULL;
+	return ret;
}

static __be32
-- 
2.39.0
--
Chuck Lever

-- 
Jeff Layton [off-list ref]
--
Chuck Lever


Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help