[PATCH 0/1] pNFS/flexfiles: mark device unavailable on fatal connection error
From: Tigran Mkrtchyan <hidden>
Date: 2025-06-09 21:52:50
As mentioned in the thread

https://lore.kernel.org/linux-nfs/601285843.50695650.1748800817824.JavaMail.zimbra@desy.de/T/#u

we observe that interrupted batch processing jobs put the client into an
unrecoverable state that requires a reboot of the client host.

I was finally able to build a custom kernel with all required third-party
drivers to confirm my assumption: marking the pNFS device unavailable does
indeed fix the issue. Please consider the proposed change and backport it
to older kernels.

I did testing with (which is not part of the patch) and will try to add a
trace point as soon as I find out how to implement one.

Tigran Mkrtchyan (1):
  pNFS/flexfiles: mark device unavailable on fatal connection error

 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 ++++
 1 file changed, 4 insertions(+)

-- 
2.49.0