Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache

[PATCH v3 0/4] nfsd: clean up refcounting in the filecache · Jeff Layton <jlayton@kernel.org> · 2022-10-28
[PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-28
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · NeilBrown <hidden> · 2022-10-30
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-31
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Jeff Layton <jlayton@kernel.org> · 2022-10-31
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-31
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Jeff Layton <jlayton@kernel.org> · 2022-10-31
Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition · Jeff Layton <jlayton@kernel.org> · 2022-10-31
[PATCH v3 1/4] nfsd: remove the pages_flushed statistic from filecache · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 1/4] nfsd: remove the pages_flushed statistic from filecache · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-28
[PATCH v3 4/4] nfsd: start non-blocking writeback after adding nfsd_file to the LRU · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 4/4] nfsd: start non-blocking writeback after adding nfsd_file to the LRU · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-28
Re: [PATCH v3 4/4] nfsd: start non-blocking writeback after adding nfsd_file to the LRU · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 4/4] nfsd: start non-blocking writeback after adding nfsd_file to the LRU · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-28
Re: [PATCH v3 4/4] nfsd: start non-blocking writeback after adding nfsd_file to the LRU · Jeff Layton <jlayton@kernel.org> · 2022-10-31
[PATCH v3 2/4] nfsd: rework refcounting in filecache · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-28
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-28
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Jeff Layton <jlayton@kernel.org> · 2022-10-28
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Chuck Lever III <chuck.lever@oracle.com> · 2022-10-28
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Jeff Layton <jlayton@kernel.org> · 2022-10-31
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Chuck Lever III <chuck.lever@oracle.com> · 2022-11-01
Re: [PATCH v3 2/4] nfsd: rework refcounting in filecache · Jeff Layton <jlayton@kernel.org> · 2022-11-01

From: Chuck Lever III <chuck.lever@oracle.com>
Date: 2022-11-01 13:58:25

On Oct 28, 2022, at 4:13 PM, Jeff Layton [off-list ref] wrote:

On Fri, 2022-10-28 at 19:49 +0000, Chuck Lever III wrote:

quoted

On Oct 28, 2022, at 2:57 PM, Jeff Layton [off-list ref] wrote:

quoted

I'm still not sold on the idea of a synchronous flush in nfsd_file_free().

I think that we need to call this there to ensure that writeback errors
are handled. I worry that if try to do this piecemeal, we could end up
missing errors when they fall off the LRU.

quoted

That feels like a deadlock waiting to happen and quite difficult to
reproduce because I/O there is rarely needed. It could help to put a
might_sleep() in nfsd_file_fsync(), at least, but I would prefer not to
drive I/O in that path at all.

I don't quite grok the potential for a deadlock here. nfsd_file_free
already has to deal with blocking activities due to it effective doing a
close(). This is just another one. That's why nfsd_file_put has a
might_sleep in it (to warn its callers).

What's the deadlock scenario you envision?

I never answered this question.

I'll say up front that I believe this problem exists in the current code
base, so what follows is meant to document an existing issue rather than
a problem with this patch series.

The filecache sets up a shrinker callback. This callback uses the same
or similar code paths as the filecache garbage collector.

Dai has found that trying to fsync inside a shrinker callback will
lead to deadlocks on some filesystems (notably I believe he was testing
btrfs at the time).

To address this, the filecache shrinker callback could avoid evicting
nfsd_file items that are open for write.


--
Chuck Lever

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help