Thread (53 messages), 4 authors, 2022-01-04

Re: [PATCH RFC v5 0/2] nfsd: Initial implementation of NFSv4 Courteous Server

From: Bruce Fields <hidden>
Date: 2021-12-03 21:22:02
Also in: linux-fsdevel

On Wed, Dec 01, 2021 at 02:50:50PM -0500, Bruce Fields wrote:
On Wed, Dec 01, 2021 at 01:03:39PM -0500, Bruce Fields wrote:
On Wed, Dec 01, 2021 at 12:42:05PM -0500, Bruce Fields wrote:
On Wed, Dec 01, 2021 at 09:36:30AM -0500, Bruce Fields wrote:
OK, good to know.  It'd be interesting to dig into where nfsdcltrack is
spending its time, which we could do by replacing it with a wrapper that
runs the real nfsdcltrack under strace.
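A minimal sketch of such a wrapper, assuming the real helper has been moved aside. The paths /usr/bin/strace, /sbin/nfsdcltrack.real and the log location under /tmp are hypothetical; adjust them to the system at hand. It forwards the upcall arguments unchanged and asks strace to record per-syscall times (-T), follow children (-f), and write to a per-process log (-o):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical locations, not taken from the thread. */
#define STRACE_PATH "/usr/bin/strace"
#define REAL_HELPER "/sbin/nfsdcltrack.real"
#define PREFIX_ARGS 6   /* strace -f -T -o <log> <real helper> */

/*
 * Build "strace -f -T -o <logpath> <real helper> <original args...>".
 * Returns a malloc'ed, NULL-terminated vector (caller frees), or NULL.
 */
char **build_strace_argv(int argc, char **argv, const char *logpath)
{
	assert(argc >= 1);

	char **out = calloc((size_t)argc + PREFIX_ARGS, sizeof(*out));
	int n = 0;

	if (!out)
		return NULL;
	out[n++] = "strace";
	out[n++] = "-f";
	out[n++] = "-T";
	out[n++] = "-o";
	out[n++] = (char *)logpath;
	out[n++] = REAL_HELPER;
	for (int i = 1; i < argc; i++)	/* skip the wrapper's own argv[0] */
		out[n++] = argv[i];
	out[n] = NULL;
	return out;
}

/* Replace the current process with strace running the real helper. */
int run_under_strace(int argc, char **argv)
{
	char logpath[64];
	char **args;

	snprintf(logpath, sizeof(logpath), "/tmp/nfsdcltrack-strace.%ld.log",
		 (long)getpid());
	args = build_strace_argv(argc, argv, logpath);
	if (!args)
		return 127;
	execv(STRACE_PATH, args);
	perror("execv");	/* only reached if execv() failed */
	free(args);
	return 127;
}
```

A wrapper binary installed in place of nfsdcltrack would only need a main() that returns run_under_strace(argc, argv).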

Though maybe it'd be better to do this on a system using nfsdcld, since
that's what we're transitioning to.
Trying that on a test VM here, I see each upcall doing 3 fdatasync()s of
an sqlite-journal file.  On my setup, each of those is taking a few
milliseconds.  I wonder if it can do better.
If I understand the sqlite documentation correctly, I *think* that if we
use journal_mode WAL with synchronous FULL, we should get the assurances
nfsd needs with one sync per transaction.
So I *think* that would mean just doing something like (untested, don't have
much idea what I'm doing):
OK, tried that out on my test VM, and: yes, the resulting strace was
much simpler (and, in particular, had only one fdatasync per upcall
instead of 3), and total time to expire 1000 courtesy clients was 6.5
seconds instead of 15.9.  So, I'll clean up that patch and pass it along
to Steve D.

This is all a bit of a derail, I know, but I suspect this will be a
bottleneck in other cases too, like when a lot of clients are reclaiming
after reboot.

We do need nfsdcld to sync to disk before returning to the kernel, so
this probably can't be further optimized without doing something more
complicated to allow some kind of parallelism and batching.

So if you have a ton of clients you'll just need /var/lib/nfs to be on
low-latency storage.
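To get a feel for whether a given filesystem is "low-latency" in this sense, something like the following could be used. It is only a rough sketch: the function names and the probe file are made up for illustration, and the estimate assumes the syncs dominate the upcall time, which the numbers above suggest but do not prove.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Write a small block and fdatasync() it, `iterations` times.
 * Returns the mean fdatasync() latency in milliseconds, or -1.0 on error.
 * The probe file at `path` is created and removed again.
 */
double mean_fdatasync_ms(const char *path, int iterations)
{
	assert(iterations > 0);

	char buf[512];
	double total_ms = 0.0;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		return -1.0;
	memset(buf, 0, sizeof(buf));

	for (int i = 0; i < iterations; i++) {
		struct timespec t0, t1;

		if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
			goto fail;
		clock_gettime(CLOCK_MONOTONIC, &t0);
		if (fdatasync(fd) != 0)
			goto fail;
		clock_gettime(CLOCK_MONOTONIC, &t1);
		total_ms += (t1.tv_sec - t0.tv_sec) * 1e3 +
			    (t1.tv_nsec - t0.tv_nsec) / 1e6;
	}
	close(fd);
	unlink(path);
	return total_ms / iterations;
fail:
	close(fd);
	unlink(path);
	return -1.0;
}

/* Rough lower bound in seconds, counting nothing but the syncs. */
double estimated_expiry_seconds(int clients, int syncs_per_upcall,
				double sync_ms)
{
	return clients * syncs_per_upcall * sync_ms / 1000.0;
}
```

Pointing mean_fdatasync_ms() at a scratch file on the filesystem that holds /var/lib/nfs and feeding the result to estimated_expiry_seconds() with one sync per upcall gives a floor on how long serialized upcalls for a given number of clients would take.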

--b.
diff --git a/utils/nfsdcld/sqlite.c b/utils/nfsdcld/sqlite.c
index 03016fb95823..b30f2614497b 100644
--- a/utils/nfsdcld/sqlite.c
+++ b/utils/nfsdcld/sqlite.c
@@ -826,6 +826,13 @@ sqlite_prepare_dbh(const char *topdir)
                goto out_close;
        }
 
+       ret = sqlite3_exec(dbh, "PRAGMA journal_mode = WAL;", NULL, NULL, NULL);
+       if (ret)
+               goto out_close;
+       ret = sqlite3_exec(dbh, "PRAGMA synchronous = FULL;", NULL, NULL, NULL);
+       if (ret)
+               goto out_close;
+
        ret = sqlite_query_schema_version();
        switch (ret) {
        case CLD_SQLITE_LATEST_SCHEMA_VERSION:
I also wonder how expensive the extra overhead of starting up
nfsdcltrack each time may be.

--b.