Thread: 31 messages, 4 authors, 2023-01-23

Re: [PATCH 2/2] nfsd: clean up potential nfsd_file refcount leaks in COPY codepath

From: Chuck Lever III <chuck.lever@oracle.com>
Date: 2023-01-18 16:40:18

On Jan 18, 2023, at 11:29 AM, Olga Kornievskaia [off-list ref] wrote:

On Wed, Jan 18, 2023 at 10:27 AM Jeff Layton [off-list ref] wrote:
On Wed, 2023-01-18 at 09:42 -0500, Olga Kornievskaia wrote:
On Tue, Jan 17, 2023 at 2:38 PM Jeff Layton [off-list ref] wrote:
There are two different flavors of the nfsd4_copy struct. One is
embedded in the compound and is used directly in synchronous copies. The
other is dynamically allocated, refcounted and tracked in the client
structure. For the embedded one, the cleanup just involves releasing any
nfsd_files held on its behalf. For the async one, the cleanup is a bit
more involved, and we need to dequeue it from lists, unhash it, etc.

There is at least one potential refcount leak in this code now. If the
kthread_create call fails, then both the src and dst nfsd_files in the
original nfsd4_copy object are leaked.
I don't believe that's true. If kthread_create thread fails we call
cleanup_async_copy() that does a put on the file descriptors.
You mean this?

out_err:
       if (async_copy)
               cleanup_async_copy(async_copy);

That puts the references that were taken in dup_copy_fields, but the
original (embedded) nfsd4_copy also holds references and those are not
being put in this codepath.
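
A minimal userspace sketch of the accounting being argued about here, assuming (as this message claims, and as the reply below questions) that the embedded copy holds its own references. All names (file_ref, copy_state, setup_async_copy) are hypothetical stand-ins, not the nfsd code; the thread_ok flag simulates whether kthread_create succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted nfsd_file. */
struct file_ref {
	int refcount;
};

/* Both helpers tolerate NULL, since src may be absent for an inter copy. */
static void file_get(struct file_ref *f)
{
	if (f)
		f->refcount++;
}

static void file_put(struct file_ref *f)
{
	if (f) {
		assert(f->refcount > 0);
		f->refcount--;
	}
}

/* Hypothetical model of one copy structure's file references. */
struct copy_state {
	struct file_ref *src;
	struct file_ref *dst;
};

/* The async duplicate takes its own references on the same files. */
static void dup_copy_refs(struct copy_state *async,
			  const struct copy_state *embedded)
{
	async->src = embedded->src;
	async->dst = embedded->dst;
	file_get(async->src);
	file_get(async->dst);
}

static void release_copy_refs(struct copy_state *c)
{
	file_put(c->src);
	file_put(c->dst);
	c->src = NULL;
	c->dst = NULL;
}

/* Returns 0 if the worker "started", -1 if thread creation failed. */
static int setup_async_copy(struct copy_state *embedded,
			    struct copy_state *async, bool thread_ok)
{
	dup_copy_refs(async, embedded);
	if (!thread_ok)
		release_copy_refs(async); /* puts only the duplicate's refs */
	/* Drop the embedded copy's refs on every path before returning. */
	release_copy_refs(embedded);
	return thread_ok ? 0 : -1;
}
```

In this model, omitting the final release_copy_refs(embedded) call leaves each file's count one too high after a failed start, which is the kind of leak described above.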
Can you please point out where we take a reference on the original copy?
The cleanup in this codepath is also sort of weird. In the async copy
case, we'll have up to four nfsd_file references (src and dst for both
flavors of copy structure).
That's not true. There is a careful distinction between intra, which
has 2 valid file pointers and does a get on both as they both point to
something that's opened on this server, and inter, which only does a get
on the dst file descriptor because the src doesn't exist. And yes, I realize
the code checks for nfs_src being null, which it should be, but it makes
the code less clear and at some point somebody might decide to
really do a put on it.
This is part of the problem here. We have a nfsd4_copy structure, and
depending on what has been done to it, you need to call different
methods to clean it up. That seems like a real antipattern to me.
But they call different methods because different things need to be
done there and it makes it clear what needs to be done for what type of
copy.
In cases like this, it makes sense to consider using types to
ensure the code can't do the wrong thing. So you might want to
have a struct nfs4_copy_A for the inter code to use, and a struct
nfs4_copy_B for the intra code to use. Sharing the same struct
for both use cases is probably what's confusing to human readers.
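
A rough userspace sketch of that idea, under stated assumptions: file_ref stands in for nfsd_file, and copy_intra / copy_inter are hypothetical names playing the role of the two structs suggested above. None of this is the actual nfsd code. The inter type has no src member at all, so code handling it cannot put a source file by mistake.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's refcounted nfsd_file. */
struct file_ref {
	int refcount;
};

static struct file_ref *file_get(struct file_ref *f)
{
	f->refcount++;
	return f;
}

static void file_put(struct file_ref *f)
{
	assert(f != NULL && f->refcount > 0);
	f->refcount--;
}

/* Intra-server copy: both files are open on this server. */
struct copy_intra {
	struct file_ref *src;
	struct file_ref *dst;
};

/* Inter-server copy: only the destination is a local file. */
struct copy_inter {
	struct file_ref *dst;
};

static void copy_intra_init(struct copy_intra *c, struct file_ref *src,
			    struct file_ref *dst)
{
	c->src = file_get(src);
	c->dst = file_get(dst);
}

static void copy_intra_cleanup(struct copy_intra *c)
{
	file_put(c->src);
	file_put(c->dst);
	c->src = NULL;
	c->dst = NULL;
}

static void copy_inter_init(struct copy_inter *c, struct file_ref *dst)
{
	c->dst = file_get(dst);
}

static void copy_inter_cleanup(struct copy_inter *c)
{
	file_put(c->dst);
	c->dst = NULL;
}
```

The cost is the small amount of duplication between the two cleanup helpers; the gain is that each type's reference counting can be read on its own.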

I've never been a stickler for removing every last ounce of code
duplication. Here, it might help to have a little duplication
just to make it easier to reason about the reference counting in
the two use cases.

That's my view from the mountain top, worth every penny you paid
for it.

They are both put at the end of
nfsd4_do_async_copy, even though the ones held on behalf of the embedded
one outlive that structure.

Change it so that we always clean up the nfsd_file refs held by the
embedded copy structure before nfsd4_copy returns. Rework
cleanup_async_copy to handle both inter and intra copies. Eliminate
nfsd4_cleanup_intra_ssc since it now becomes a no-op.
I feel that combining the cleanup for both obscures a very important
distinction: the src filehandle doesn't exist for inter.
If the src filehandle doesn't exist, then the pointer to it will be
NULL. I don't see what we gain by keeping these two distinct, other than
avoiding a NULL pointer check.
My reason would be for code clarity because different things are
supposed to happen for intra and inter. Difference of opinion it
seems.
--
Chuck Lever

