Thread: 74 messages, 9 authors, 2011-11-12

Re: [PATCH 00/18] IO-less dirty throttling v11

From: Trond Myklebust <hidden>
Date: 2011-09-07 19:14:46
Also in: linux-fsdevel, lkml

On Wed, 2011-09-07 at 21:32 +0800, Wu Fengguang wrote: 
Finally, the complete IO-less balance_dirty_pages(). NFS is observed to perform
better or worse depending on the memory size. Otherwise the added patches can
address all known regressions.
I find that the NFS performance regressions on large-memory systems can
be fixed by this patch. It tries to make progress smoother by
reasonably reducing the commit size.

Thanks,
Fengguang
---
Subject: nfs: limit the commit size to reduce fluctuations
Date: Thu Dec 16 13:22:43 CST 2010

Limit the commit size to half the dirty control scope, so that the
arrival of one commit will not knock the overall dirty pages off the
scope.

Also limit the commit size to one second worth of data. This will
obviously help make the pipeline run more smoothly.

Also change "<=" to "<": if an inode has only one dirty page in the end,
it should be committed. I wonder why the "<=" didn't cause a bug...

CC: Trond Myklebust <redacted>
Signed-off-by: Wu Fengguang <redacted>
---
 fs/nfs/write.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

After patch, there are still drop offs from the control scope,

http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/NFS/nfs-1dd-1M-8p-2945M-20%25-2.6.38-rc6-dt6+-2011-02-22-21-09/balance_dirty_pages-pages.png

due to bursty arrival of commits:

http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/NFS/nfs-1dd-1M-8p-2945M-20%25-2.6.38-rc6-dt6+-2011-02-22-21-09/nfs-commit.png
--- linux-next.orig/fs/nfs/write.c	2011-09-07 21:29:15.000000000 +0800
+++ linux-next/fs/nfs/write.c	2011-09-07 21:29:32.000000000 +0800
@@ -1543,10 +1543,14 @@ static int nfs_commit_unstable_pages(str
 	int ret = 0;
 
 	if (wbc->sync_mode == WB_SYNC_NONE) {
+		unsigned long bw = MIN_WRITEBACK_PAGES +
+			NFS_SERVER(inode)->backing_dev_info.avg_write_bandwidth;
+
 		/* Don't commit yet if this is a non-blocking flush and there
-		 * are a lot of outstanding writes for this mapping.
+		 * are a lot of outstanding writes for this mapping, until
+		 * collected enough pages to commit.
 		 */
-		if (nfsi->ncommit <= (nfsi->npages >> 1))
+		if (nfsi->ncommit < min(nfsi->npages / DIRTY_SCOPE, bw))
 			goto out_mark_dirty;
 
 		/* don't wait for the COMMIT response */
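
To make the change in the hunk above easier to compare, here is a minimal standalone sketch of the old and new "defer the commit" tests. It is not kernel code: the function names are hypothetical, and the two constants are placeholder values assumed purely for illustration, since the quoted patch does not state what DIRTY_SCOPE or MIN_WRITEBACK_PAGES are. The bandwidth argument is assumed to be expressed in pages.

```c
#include <stdbool.h>

/* Placeholder values, assumed for illustration only; the quoted patch
 * does not give the real definitions of these constants. */
#define SKETCH_DIRTY_SCOPE          8UL
#define SKETCH_MIN_WRITEBACK_PAGES  1024UL

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Old test: defer while at most half of the inode's pages await COMMIT. */
static bool defer_commit_old(unsigned long ncommit, unsigned long npages)
{
	return ncommit <= (npages >> 1);
}

/* New test from the hunk: defer only while ncommit is below the smaller of
 * (npages / DIRTY_SCOPE) and (MIN_WRITEBACK_PAGES + avg_write_bandwidth). */
static bool defer_commit_new(unsigned long ncommit, unsigned long npages,
			     unsigned long avg_write_bandwidth)
{
	unsigned long bw = SKETCH_MIN_WRITEBACK_PAGES + avg_write_bandwidth;

	return ncommit < min_ul(npages / SKETCH_DIRTY_SCOPE, bw);
}
```

Under these assumed values, an inode with 1000 pages of which 500 await COMMIT is deferred by the old test but committed by the new one, because the new threshold is 1000 / 8 = 125 pages. For a very large inode the bandwidth term becomes the smaller bound, which is the "one second worth of data" cap described in the changelog.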
So what goes into the 'avg_write_bandwidth' variable that makes it a
good measure above (why 1 second of data instead of 10 seconds or
1ms, ...)? What is the 'DIRTY_SCOPE' value?

IOW: what new black magic are we introducing above and why is it so
obviously better than what we have (yes, I see you have graphs, but that
is just measuring _one_ NFS setup and workload).

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
