Re: [PATCH] lockd: set other missing fields when unlocking files
From: Chuck Lever III <chuck.lever@oracle.com>
Date: 2022-11-07 21:55:52
On Nov 7, 2022, at 5:48 AM, Jeff Layton [off-list ref] wrote: On Sun, 2022-11-06 at 14:02 -0500, trondmy@kernel.org wrote:quoted
From: Trond Myklebust <redacted> vfs_lock_file() expects the struct file_lock to be fully initialised by the caller.
As a reviewer, I don't see anything in the vfs_lock_file() kdoc comment that suggests this, and vfs_lock_file() itself is just a wrapper around each filesystem's f_ops->lock method. That expectation is a bit deeper into NFS-specific code. A few more observations below.
quoted
Re-exported NFSv3 has been seen to Oops if the fl_file field is NULL.
Needs a Link: to the bug report. Which I can add. This will also give us a call trace we can reference, so I won't add that here.
quoted
Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files") Signed-off-by: Trond Myklebust <redacted> --- fs/lockd/svcsubs.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c index e1c4617de771..3515f17eaf3f 100644 --- a/fs/lockd/svcsubs.c +++ b/fs/lockd/svcsubs.c@@ -176,7 +176,7 @@ nlm_delete_file(struct nlm_file *file)} } -static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner) +static int nlm_unlock_files(struct nlm_file *file, const struct file_lock *fl) { struct file_lock lock;@@ -184,12 +184,15 @@ static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)lock.fl_type = F_UNLCK; lock.fl_start = 0; lock.fl_end = OFFSET_MAX; - lock.fl_owner = owner; - if (file->f_file[O_RDONLY] && - vfs_lock_file(file->f_file[O_RDONLY], F_SETLK, &lock, NULL)) + lock.fl_owner = fl->fl_owner; + lock.fl_pid = fl->fl_pid; + lock.fl_flags = FL_POSIX; + + lock.fl_file = file->f_file[O_RDONLY]; + if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL)) goto out_err; - if (file->f_file[O_WRONLY] && - vfs_lock_file(file->f_file[O_WRONLY], F_SETLK, &lock, NULL)) + lock.fl_file = file->f_file[O_WRONLY]; + if (lock.fl_file && vfs_lock_file(lock.fl_file, F_SETLK, &lock, NULL)) goto out_err; return 0; out_err:@@ -226,7 +229,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,if (match(lockhost, host)) { spin_unlock(&flctx->flc_lock); - if (nlm_unlock_files(file, fl->fl_owner)) + if (nlm_unlock_files(file, fl)) return 1; goto again; }Good catch. I wonder if we ought to roll an initializer function for file_locks to make it harder for callers to miss setting some fields like this? One idea: we could change vfs_lock_file to *not* take a file argument, and insist that the caller fill out fl_file when calling it? That would make it harder to screw this up.
Commit history shows that, at least as far back as the beginning of the git era, the vfs_lock_file() call site here did not initialize the fl_file field. So, this code has been working without fully initializing @fl for, like, forever. Trond later says:
The regression occurs in 5.16, because that was when Bruce merged his patches to enable locking when doing NFS re-exporting.
That means the Fixes: tag above is misleading. The proposed patch doesn't actually fix that commit (which went into v5.19), it simply applies on that commit. I haven't been able to find the locking patches mentioned here. I think those bear mentioning (by commit ID) in the patch description, at least. If you know the commit ID, Trond, can you pass it along? Though I would say that, in agreement with Jeff, the true cause of this issue is the awkward synopsis for vfs_lock_file(). -- Chuck Lever