Re: deadlock in RELEASE_LOCKOWNER path
From: NeilBrown <hidden>
Date: 2024-02-05 00:56:38
Subsystem:
filesystems (vfs and infrastructure), kernel nfsd, sunrpc, and lockd servers, the rest · Maintainers:
Alexander Viro, Christian Brauner, Chuck Lever, Jeff Layton, Linus Torvalds
On Mon, 05 Feb 2024, Chuck Lever III wrote:
Hi Neil-
I'm testing v6.8-rc3 + nfsd-next. This series includes:
nfsd: fix RELEASE_LOCKOWNER
and
nfsd: don't call locks_release_private() twice concurrently
======================================================
WARNING: possible circular locking dependency detected
6.8.0-rc3-00052-gc20ad5c2d60c #1 Not tainted
------------------------------------------------------
nfsd/1218 is trying to acquire lock:
ffff88814d754198 (&ctx->flc_lock){+.+.}-{2:2}, at: check_for_locks+0xf6/0x1d0 [nfsd]
but task is already holding lock:
ffff8881210e61f0 (&fp->fi_lock){+.+.}-{2:2}, at: check_for_locks+0x2d/0x1d0 [nfsd]
which lock already depends on the new lock.I should have found this when I was checking on flc_lock... sorry. I think this could actually deadlock if a lease was being broken on a file at the same time that some interesting file locking was happening ... though that might not actually be possible as conflicting locks and delegations rarely mix. Still - we should fix it. One approach would be to split flc_lock into two locks, one for leases and one for posix+flock locks. That would avoid this conflict, but isn't particularly elegant. I'm not at all certain that nfsd_break_deleg_cb() needs to take fi_lock. In earlier code it needed to walk ->fi_delegations and that would need the lock - but that was removed in v4.17-rc1~110^2~22. The only remaining possible need for fi_lock is fi_had_conflict. nfsd_break_deleg_cb() take the lock when setting the flag, and nfsd_set_delegation() takes the lock when testing the flag. I cannot see why the strong exclusion is needed. We test fi_had_conflict early in nfsd_set_delegation as an optimisation, and that makes sense. Test it again after calling vfs_setlease() to get the lease makes sense in case there was some other grant/revoke before the early test and the vfs_setlease(). But once vfs_setlease has succeed and we confirmed no conflict yet, I cannot see why we would abort the least. If a revoke came in while nfsd_set_deletation() is running the result doesn't need to be different to if a revoke comes in just aster nfsd_set_delegation() completes.... Does it? So I think we can drop the lock frome break_deleg_cb() and make the importance of fi_had_conflict explicit by moving the test out of the lock. e.g.
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 12534e12dbb3..8b112673d389 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c@@ -5154,10 +5154,8 @@ nfsd_break_deleg_cb(struct file_lock *fl) */ fl->fl_break_time = 0; - spin_lock(&fp->fi_lock); fp->fi_had_conflict = true; nfsd_break_one_deleg(dp); - spin_unlock(&fp->fi_lock); return false; }
@@ -5771,13 +5769,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, if (status) goto out_unlock; + status = -EAGAIN; + if (fp->fi_had_conflict) + goto out_unlock; + spin_lock(&state_lock); spin_lock(&clp->cl_lock); spin_lock(&fp->fi_lock); - if (fp->fi_had_conflict) - status = -EAGAIN; - else - status = hash_delegation_locked(dp, fp); + status = hash_delegation_locked(dp, fp); spin_unlock(&fp->fi_lock); spin_unlock(&clp->cl_lock); spin_unlock(&state_lock);
fi_had_conflict is set under flc_lock, and vfs_setlease() will take flc_lock, so while the set and test may appear lockless, they aren't. I'll do some basic testing and send a proper patch for your consideration. Thanks NeilBrown