Re: Host managed SMR drive issue
From: Johannes Thumshirn <hidden>
Date: 2021-09-29 10:29:47
On 28/09/2021 08:36, Johannes Thumshirn wrote:
> On 28/09/2021 01:34, Sven Oehme wrote:
>> the workload is as simple as one can imagine:
>>
>> scp -rp /space/01/ /space/20/
>>
>> where space/01 is a ext4 filesystem with several >100gb files in it,
>> /space/20 is the btrfs destination
>>
>> its single threaded, nothing special
>
> Thanks for the info, I'm trying to recreate the issue locally.

OK, unfortunately I'm not getting anywhere with my attempts to reproduce the issue. But I have a hypothesis about what could be happening.

Can you do me a favor and try this:

    echo 'r:myretprobe sd_zbc_prepare_zone_append $retval' >> /sys/kernel/debug/tracing/kprobe_events
    echo 1 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable
    echo 1 > /sys/kernel/debug/tracing/tracing_on

and then re-run your copy process. Once the hang occurs, please dump the trace buffer

    cat /sys/kernel/debug/tracing/trace > /tmp/trace.txt

so we can examine it.

I'm expecting we're seeing a lot of 13s as return value (BLK_STS_ZONE_RESOURCE), which would mean the zone append emulation in the SCSI stack can't lock the zone for writing and re-queues the I/O to retry later. That retry never really happens because... I have no idea yet.

Fingers crossed my hypothesis is correct so we know where to start looking.

Thanks,
Johannes
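
P.S. If the dump turns out to be large, a tally of the recorded return values may be quicker to read than the raw trace. The following is only a rough sketch: count_retvals is a hypothetical helper name, and it assumes each myretprobe event line carries the return value as a field of the form arg1=0x..., which is how ftrace usually names and prints an unnamed fetch argument. If the lines in your trace look different, the pattern needs adjusting.

```shell
# Tally the return values captured by the myretprobe kretprobe.
# Assumption: the value appears on each event line as "arg1=0x<hex>".
# count_retvals is a hypothetical helper, not part of any kernel tooling.
count_retvals() {
    # Keep only our probe's events, extract the value field,
    # then count identical values, most frequent first.
    grep 'myretprobe' "$1" |
        grep -o 'arg1=0x[0-9a-f]*' |
        sort | uniq -c | sort -rn
}
```

Running count_retvals /tmp/trace.txt after dumping the buffer prints one line per distinct value with its count in front. Since the values are shown in hexadecimal under this assumption, a return value of 13 would appear as arg1=0xd.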