Thread: 45 messages, 6 authors, 2023-03-02

Re: Zombie / Orphan open files

From: Jeff Layton <jlayton@kernel.org>
Date: 2023-01-31 18:13:39

On Tue, 2023-01-31 at 16:34 +0000, Chuck Lever III wrote:
> On Jan 31, 2023, at 9:42 AM, Andrew J. Romero [off-list ref] wrote:
>
>> In a large campus environment, usage of the relevant memory pool will eventually get so
>> high that a server-side reboot will be needed.
>
> The above is sticking with me a bit.
>
> Rebooting the server should force clients to re-establish state.
>
> Are they not re-establishing open file state for users whose
> ticket has expired? I would think each client would re-establish
> state for those open files anyway, and the server would be in the
> same overcommitted state it was in before it rebooted.
>
> We might not have an accurate root cause analysis yet, or I could
> be missing something.

My assumption was that the client wasn't able to get credentials to run
the CLOSE RPC in this case, so it can't properly send the call. That's a
big assumption though. It'd be good to confirm this.

It looks like the CLOSE codepath on the client calls nfs4_state_protect
with NFS_SP4_MACH_CRED_CLEANUP, and that should make it use the machine
cred? I'm not 100% clear here though...it looks like that may be
conditional on what was sent by the server in EXCHANGE_ID.
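
To make the conditional part concrete, here is a toy model of that decision. It is a sketch under assumptions, not the kernel code: `Client`, `pick_cred`, `try_close` and the `MACH_CRED_CLEANUP` constant are hypothetical names, and the set of negotiated modes stands in for whatever the client records from the server's EXCHANGE_ID reply.

```python
from dataclasses import dataclass, field

# Hypothetical mode name, standing in for the kernel's NFS_SP4_MACH_CRED_CLEANUP.
MACH_CRED_CLEANUP = "mach_cred_cleanup"


@dataclass
class Client:
    # Modes the server agreed to in its EXCHANGE_ID reply (assumed representation).
    negotiated_modes: set = field(default_factory=set)


def pick_cred(client, mode, user_cred, machine_cred="machine"):
    """Return the credential to use for an RPC issued under `mode`."""
    if mode in client.negotiated_modes:
        # Server allowed this mode, so the machine cred can be used.
        return machine_cred
    # Not negotiated: fall back to the user's cred, which may be gone.
    return user_cred


def try_close(client, user_cred_valid):
    """Return True if CLOSE could be sent, False if no usable cred exists."""
    user_cred = "user" if user_cred_valid else None
    return pick_cred(client, MACH_CRED_CLEANUP, user_cred) is not None
```

In this model, an expired ticket only blocks CLOSE when the cleanup mode was not negotiated: `try_close(Client(), False)` fails, while `try_close(Client({MACH_CRED_CLEANUP}), False)` still goes through with the machine cred. Whether the real client behaves this way is exactly the part that needs confirming.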

FWIW, I don't see any reason we shouldn't use the machine cred for the
close compound. Nothing we do in there should require permission
checking.

BTW: is this NFSv4.0 or v4.1+ (or a mix)?
-- 
Jeff Layton [off-list ref]