Re: [PATCH 00/18] IO-less dirty throttling v11
From: Trond Myklebust <hidden>
Date: 2011-09-07 19:14:46
Also in:
linux-fsdevel, lkml
On Wed, 2011-09-07 at 21:32 +0800, Wu Fengguang wrote:
quoted hunk ↗ jump to hunk
quoted
Finally, the complete IO-less balance_dirty_pages(). NFS is observed to perform better or worse depending on the memory size. Otherwise the added patches can address all known regressions.I find that the NFS performance regressions on large memory system can be fixed by this patch. It tries to make the progress more smooth by reasonably reducing the commit size. Thanks, Fengguang --- Subject: nfs: limit the commit size to reduce fluctuations Date: Thu Dec 16 13:22:43 CST 2010 Limit the commit size to half the dirty control scope, so that the arrival of one commit will not knock the overall dirty pages off the scope. Also limit the commit size to one second worth of data. This will obviously help make the pipeline run more smoothly. Also change "<=" to "<": if an inode has only one dirty page in the end, it should be committed. I wonder why the "<=" didn't cause a bug... CC: Trond Myklebust <redacted> Signed-off-by: Wu Fengguang <redacted> --- fs/nfs/write.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) After patch, there are still drop offs from the control scope, http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/NFS/nfs-1dd-1M-8p-2945M-20%25-2.6.38-rc6-dt6+-2011-02-22-21-09/balance_dirty_pages-pages.png due to bursty arrival of commits: http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v6/NFS/nfs-1dd-1M-8p-2945M-20%25-2.6.38-rc6-dt6+-2011-02-22-21-09/nfs-commit.png--- linux-next.orig/fs/nfs/write.c 2011-09-07 21:29:15.000000000 +0800 +++ linux-next/fs/nfs/write.c 2011-09-07 21:29:32.000000000 +0800@@ -1543,10 +1543,14 @@ static int nfs_commit_unstable_pages(str int ret = 0; if (wbc->sync_mode == WB_SYNC_NONE) { + unsigned long bw = MIN_WRITEBACK_PAGES + + NFS_SERVER(inode)->backing_dev_info.avg_write_bandwidth; + /* Don't commit yet if this is a non-blocking flush and there - * are a lot of outstanding writes for this mapping. + * are a lot of outstanding writes for this mapping, until + * collected enough pages to commit. */ - if (nfsi->ncommit <= (nfsi->npages >> 1)) + if (nfsi->ncommit < min(nfsi->npages / DIRTY_SCOPE, bw)) goto out_mark_dirty; /* don't wait for the COMMIT response */
So what goes into the 'avg_write_bandwidth' variable that makes it a good measure above (why 1 second of data instead of 10 seconds or 1ms, ...)? What is the 'DIRTY_SCOPE' value? IOW: what new black magic are we introducing above and why is it so obviously better than what we have (yes, I see you have graphs, but that is just measuring _one_ NFS setup and workload). -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>