Thread (4 messages) 4 messages, 2 authors, 2023-01-26

Re: [PATCH/RFC] NFSv3 handle out-of-order write replies

From: Jeff Layton <jlayton@kernel.org>
Date: 2023-01-26 21:39:08

On Fri, 2023-01-27 at 07:36 +1100, NeilBrown wrote:
On Thu, 26 Jan 2023, Jeff Layton wrote:
quoted
On Wed, 2023-01-25 at 13:03 +1100, NeilBrown wrote:
quoted
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 7931fa472561..8a83d6d204ed 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -190,6 +190,33 @@ struct nfs_inode {
 	/* Open contexts for shared mmap writes */
 	struct list_head	open_files;
 
+	/* Keep track of out-of-order replies.
+	 * The ooo array contains start/end pairs of
+	 * number from the changeid sequence when
+	 * the inodes iversion has been updated.
+	 * It also contains end/start pair (i.e. reverse order)
+	 * of sections of the changeid sequence that have
+	 * been seen in replies from the server.
+	 * Normally these should match and when both
+	 * A:B and B:A are found in ooo, they are both removed.
+	 * And if a reply with A:B causes an iversion update
+	 * of A:B, then neither are added.
+	 * When a reply has pre_change that doesn't match
+	 * iversion, then the changeid pair, and any consequent
+	 * change in iversion ARE added.  Later replies
+	 * might fill in the gaps, or possibly a gap is caused
+	 * by a change from another client.
+	 * When a file or directory is opened, if the ooo table
+	 * is not empty, then we assume the gaps were due to
+	 * another client and we invalidate the cached data.
+	 *
+	 * We can only track a limited number of concurrent gaps.
+	 * Currently that limit is 16.
+	 */
+	int ooo_cnt;
+	int ooo_max; // TRACING
+	unsigned long ooo[16][2];
Why unsigned longs here? Shouldn't these be u64?
Yes, they should be u64.  Thanks.
quoted
I guess you could argue that when we have 32-bit longs that the most
significant bits don't matter, but that's also the case with 64-bit
longs, and 32 bits would halve the space requirements.

Also, this grows each inode by 2k on a 64-bit arch! Maybe we should
dynamically allocate these things instead? If the allocation fails, then
we could just go back to marking the cache invalid and move on.
2K? 8*16*2+4*2 == 264, not 2048.
You're correct of course. I'm not sure where I got 2048.
But I agree that allocating on demand would make sense.  16 is probably
more than needed.  I don't have proper testing results from the customer
yet, but I wouldn't be surprised if a smaller number would suffice.  But
if we are allocating only on demand, it wouldn't hurt to allocate 16.
Makes sense. Is there a max number of NFS writes you can have in flight
to a single server in the client? That might inform how large an array
you need.
-- 
Jeff Layton [off-list ref]
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help