Re: [PATCH v4 3/7] NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
From: NeilBrown <hidden>
Date: 2022-10-25 01:07:51
On Tue, 25 Oct 2022, Chuck Lever III wrote:
quoted
On Oct 24, 2022, at 12:57 PM, Jeff Layton [off-list ref] wrote: On Mon, 2022-10-24 at 13:33 +1100, NeilBrown wrote:quoted
On Wed, 19 Oct 2022, Chuck Lever wrote:quoted
+ nfsd_file_lru_add(nf); + else if (refcount_read(&nf->nf_ref) == 2) + nfsd_file_unhash_and_put(nf);Tests on the value of a refcount are almost always racy.Agreed, and there's a clear race above, now that I look more closely. If nf_ref is 3 and two puts are racing then neither of them will call nfsd_file_unhash_and_put. We really should be letting the outcome of the decrement drive things (like you say below).quoted
I suspect there is an implication that as NFSD_FILE_GC is not set, this *must* be hashed which implies there is guaranteed to be a refcount from the hashtable. So this is really a test to see if the pre-biased refcount is one. The safe way to test if a refcount is 1 is dec_and_test. i.e. linkage from the hash table should not count as a reference (in the not-GC case). Lookup in the hash table should fail if the found entry cannot achieve an inc_not_zero. When dec_and_test says the refcount is zero, we remove from the hash table (certain that no new references will be taken).This does seem a more sensible approach. That would go a long way toward simplifying nfsd_file_put.So I cut-and-pasted the approach you used in the patch you sent a few weeks ago. I don't object to replacing that... but I don't see exactly where you guys are going with this.
Where I'm going with this is to suggest that this ref-counting oddity is
possibly the cause of the occasional refcounting problems that you
have had with nfsd.
I think that the hash table should never own a reference to the
nfsd_file. If the refcount ever reaches zero, then it gets removed from
the hash table.
if (refcount_dec_and_test())
if (test_and_clear_bit(HASHED,...))
delete from hash table
Transient references are counted.
For NFSv2,3 existence in the LRU is counted (I don't think they are at present).
For NFSv4, references from nfs4_file are counted.
But presence in the hash table is only indicated by the HASHED flag.
I think this would make the ref counting more robust.
NeilBrown