Re: copy_file_range() infinitely hangs on NFSv4.2 over RDMA
From: Timo Rothenpieler <hidden>
Date: 2021-02-16 20:39:07
Also in:
linux-nfs
I can't get a network trace right now (I assume capturing just TCP/20049 is fine, and a separate RDMA trace isn't needed?), but I will once a user has finished their work on the machine.

The stack of the stuck process looks as follows:
task:xfs_io state:S stack: 0 pid:841684 ppid:841677 flags:0x00004001
Call Trace:
 __schedule+0x3e9/0x660
 ? rpc_task_release_transport+0x42/0x60
 schedule+0x46/0xb0
 schedule_timeout+0x20e/0x2a0
 ? nfs4_call_sync_custom+0x23/0x30
 wait_for_completion_interruptible+0x80/0x120
 nfs42_proc_copy+0x505/0xb00
 ? find_get_pages_range_tag+0x211/0x270
 ? enqueue_task_fair+0xb5/0x500
 ? __filemap_fdatawait_range+0x66/0xf0
 nfs4_copy_file_range+0x198/0x240
 vfs_copy_file_range+0x39a/0x470
 ? ptrace_do_notify+0x82/0xb0
 __x64_sys_copy_file_range+0xd6/0x210
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f8eead1e259
RSP: 002b:00007ffcb1bb5778 EFLAGS: 00000206 ORIG_RAX: 0000000000000146
RAX: ffffffffffffffda RBX: 00007ffcb1bb5790 RCX: 00007f8eead1e259
RDX: 0000000000000003 RSI: 00007ffcb1bb5790 RDI: 0000000000000004
RBP: 0000000020000000 R08: 0000000020000000 R09: 0000000000000000
R10: 00007ffcb1bb5798 R11: 0000000000000206 R12: 00007ffcb1bb5798
R13: 0000000000000004 R14: 0000000000000001 R15: 0000000000000000
On 16.02.2021 21:12, Olga Kornievskaia wrote:
Hi Timo,

Can you get a network trace? Also, you say that the copy_file_range() (after what looks like a successful copy) never returns (and application hangs), can you get a sysrq output of what the process's stack (echo t > /proc/sysrq-trigger and see what gets dumped into the var log messages and locate your application and report what the stack says)?

On Sat, Feb 13, 2021 at 10:41 PM Timo Rothenpieler [off-list ref] wrote:
On our fileserver, running a few weeks old 5.10, we are running into a weird issue with NFS 4.2 Server-Side Copy and RDMA (and ZFS, though I'm not sure how relevant that is to the issue). The servers are connected via InfiniBand, on a Mellanox ConnectX-4 card, using the mlx5 driver.

Anything using the copy_file_range() syscall to copy stuff just hangs. In strace, the syscall never returns.

Simple way to reproduce on the client:
> xfs_io -fc "pwrite 0 1M" testfile
> xfs_io -fc "copy_range testfile" testfile.copy

The second call just never exits. It sits in S+ state, with no CPU usage, and can easily be killed via Ctrl+C. I let it sit for a couple of hours as well; it does not seem to ever complete.

Some more observations about it:

If I do a fresh reboot of the client, the operation works fine for a short while (like, 10~15 minutes). No load is on the system during that time, it's effectively idle.

The operation actually does successfully copy all data. The size and checksum of the target file are as expected. It just never returns.

This only happens when mounting via RDMA. Mounting the same NFS share via plain TCP has the operation work reliably.

Had this issue with kernel 5.4 already, and had hoped that 5.10 might have fixed it, but unfortunately it didn't.

I tried two servers and 30 different client machines, they all exhibit the exact same behaviour. So I'd carefully rule out a hardware issue.

Any pointers on how to debug or maybe even fix this?

Thanks,
Timo