[PATCH/RFC] NFS: fix use-after-free with O_DIRECT write

From: NeilBrown <hidden>
Date: 2023-06-06 03:34:10
Subsystem: filesystems (vfs and infrastructure), nfs, sunrpc, and lockd clients, the rest · Maintainers: Alexander Viro, Christian Brauner, Trond Myklebust, Anna Schumaker, Linus Torvalds

If the page size is greater than the wsize, O_DIRECT writes are broken
up into multiple sub-requests (subreqs) per page.
If there are two subreqs for a given page, one (not the head)
succeeds and the other succeeds but sees a write verifier mis-match, we
get a problem.

The first subreq will have been released (nfs_release_request) and will
have a refcount of zero and PG_TEARDOWN will be set.
The remainder of the request will be passed to
nfs_direct_write_reschedule() and thence to nfs_direct_join_group().

nfs_direct_join_group() calls nfs_release_request() on each non-head
subreq, and this results in a refcounter underflow.

The list is then passed to nfs_join_page_group() which should probably
ignore these completed subreqs too, though there is no serious problem
caused by not skipping them.  It finally gets to
nfs_destroy_unlinked_subrequests() which handles these unrefed subreqs
correctly.

This behaviour has been seen on a ppc64 machine with 64K page size,
mounting with NFSv3 an rsize=wsize=32768.  The server-side filesystem
fills up causing the Linux NFS server to report ENOSPC and to update
the write verifier.

This patch adds tests on PG_TEARDOWN and skips those subrequests in
nfs_direcf_join_group().  It doesn't make changes to
nfs_join_page_group().

If a "head" subreq succeeds but other subreqs fail,
nfs_direct_join_group() will not join those subreqs back together.  I
doubt this is correct, but I haven't yet demonstrated a crash, or worked
through all the consequences.

Signed-off-by: NeilBrown <redacted>
---
 fs/nfs/direct.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 9a18c5a69ace..d41d4b869d42 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -483,8 +483,10 @@ nfs_direct_join_group(struct list_head *list, struct inode *inode)
 		for (next = req->wb_this_page;
 				next != req->wb_head;
 				next = next->wb_this_page) {
-			nfs_list_remove_request(next);
-			nfs_release_request(next);
+			if (!test_bit(PG_TEARDOWN, &next->wb_flags)) {
+				nfs_list_remove_request(next);
+				nfs_release_request(next);
+			}
 		}
 		nfs_join_page_group(req, inode);
 	}
-- 
2.40.1
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help