Thread: 5 messages, 2 authors, 2025-06-26

[PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error

From: Tigran Mkrtchyan <hidden>
Date: 2025-06-09 21:52:50
Subsystem: filesystems (vfs and infrastructure), nfs, sunrpc, and lockd clients, the rest · Maintainers: Alexander Viro, Christian Brauner, Trond Myklebust, Anna Schumaker, Linus Torvalds

Fixes: 260f32adb88 ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect")

When an application gets killed (SIGTERM/SIGINT) while the pNFS client is
connecting to a DS, the client ends up in an infinite connect-disconnect
loop. The source of the issue is that
flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an error from
nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by
rpc_signal_task, but the error is treated as transient and is therefore
retried.

The issue can be reproduced with the following script (the directory should
contain ~1000 files, and the client must not have any open connections to
the DSes):
# Drop cached data so that each read has to go to a DS.
echo 3 > /proc/sys/vm/drop_caches

for i in *
do
        head -n 1 "$i" &
        PP=$!
        sleep 0.01
        kill -TERM "$PP"
done

Signed-off-by: Tigran Mkrtchyan <redacted>
---
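Note for readers: the retry behaviour described in the commit message can
be illustrated with a small userspace sketch. It is not kernel code. Every
sketch_* name is hypothetical, SKETCH_ERESTARTSYS mirrors the
kernel-internal value that userspace <errno.h> does not export, the set of
errors treated as fatal is abbreviated, and the assumption that a caller
stops retrying once the device is marked unavailable is a simplification of
what the flexfiles layout code does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value, defined locally for this sketch. */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical stand-in for a data server (DS) device entry. */
struct sketch_ds {
	bool unavailable;      /* set once the device is given up on */
	int connect_attempts;  /* number of connection attempts made */
};

/* Abbreviated, hypothetical classifier in the spirit of nfs_error_is_fatal(). */
static bool sketch_error_is_fatal(int err)
{
	switch (err) {
	case -SKETCH_ERESTARTSYS:
	case -EINTR:
	case -EIO:
		return true;
	default:
		return false;
	}
}

/* Simulated connect that always returns the status it is given. */
static int sketch_connect(struct sketch_ds *ds, int status)
{
	ds->connect_attempts++;
	return status;
}

/* One prepare step; returns true only if the DS is usable. */
static bool sketch_prepare_ds(struct sketch_ds *ds, int connect_status,
			      bool mark_on_fatal)
{
	int status;

	if (ds->unavailable)
		return false;

	status = sketch_connect(ds, connect_status);
	if (status == 0)
		return true;

	/* The idea of the patch: a fatal error takes the device out of use. */
	if (mark_on_fatal && sketch_error_is_fatal(status))
		ds->unavailable = true;
	return false;
}

/*
 * Caller loop, bounded by max_rounds so that the sketch terminates.
 * Returns the number of connection attempts that were made.
 */
static int sketch_retry(int connect_status, bool mark_on_fatal, int max_rounds)
{
	struct sketch_ds ds = { false, 0 };

	assert(max_rounds > 0);
	for (int i = 0; i < max_rounds; i++) {
		if (sketch_prepare_ds(&ds, connect_status, mark_on_fatal))
			break;
		if (ds.unavailable)
			break;	/* assumed: caller stops using this DS */
	}
	return ds.connect_attempts;
}
```

Under these assumptions, sketch_retry(-SKETCH_ERESTARTSYS, false, 5) uses
all 5 rounds, while sketch_retry(-SKETCH_ERESTARTSYS, true, 5) stops after
a single attempt. A transient error such as -EAGAIN is still retried in
both modes.
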
 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
index 4a304cf17c4b..0008a8180c9b 100644
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -410,6 +410,10 @@ nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg,
 			mirror->mirror_ds->ds_versions[0].wsize = max_payload;
 		goto out;
 	}
+	/* Fatal error connecting to the DS: mark it unavailable to avoid an infinite retry loop. */
+	if (nfs_error_is_fatal(status))
+		nfs4_mark_deviceid_unavailable(&mirror->mirror_ds->id_node);
+
 noconnect:
 	ff_layout_track_ds_error(FF_LAYOUT_FROM_HDR(lseg->pls_layout),
 				 mirror, lseg->pls_range.offset,
-- 
2.49.0