Re: Host managed SMR drive issue

From: Johannes Thumshirn <hidden>
Date: 2021-09-29 10:29:47

On 28/09/2021 08:36, Johannes Thumshirn wrote:

On 28/09/2021 01:34, Sven Oehme wrote:

the workload is as simple as one can imagine :

scp -rp /space/01/ /space/20/
where /space/01 is an ext4 filesystem with several >100 GB files in it and
/space/20 is the btrfs destination.

It's single-threaded, nothing special.

Thanks for the info, I'm trying to recreate the issue locally.

OK, unfortunately I'm not getting anywhere with my attempts to
reproduce the issue, but I have a hypothesis about what could be happening.

Can you do me a favor and try this:


echo 'r:myretprobe sd_zbc_prepare_zone_append $retval' >> /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable
echo 1 > /sys/kernel/debug/tracing/tracing_on

and then re-run your copy process. Once the hang occurs, please dump
the trace buffer with

cat /sys/kernel/debug/tracing/trace > /tmp/trace.txt

so we can examine it.
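
Once the trace is saved, the probe can be removed again. A minimal sketch,
assuming the probe name myretprobe from the commands above and tracefs
mounted under /sys/kernel/debug/tracing; the function name remove_retprobe
and its optional directory argument are only for illustration:

```shell
# Disable and delete the return probe created earlier.
# $1: tracefs directory (assumed default: /sys/kernel/debug/tracing).
remove_retprobe() {
    tracefs=${1:-/sys/kernel/debug/tracing}
    # A probe must be disabled before it can be deleted.
    echo 0 > "$tracefs/events/kprobes/myretprobe/enable"
    # A "-:" prefix asks the kernel to delete the named probe.
    echo '-:myretprobe' >> "$tracefs/kprobe_events"
}
```

Run remove_retprobe as root, with no argument, after the trace has been dumped.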

I expect we'll see a lot of 13s as the return value
(BLK_STS_ZONE_RESOURCE), which would mean the zone append emulation
in the SCSI stack can't lock the zone for writing and re-queues the
I/O to retry later. That retry apparently never really happens, and
I have no idea why yet.
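
To check this quickly, the return values in the dumped trace can be tallied.
A rough sketch, assuming each probe event line records the value as
"arg1=<value>" and that the value may be printed in hexadecimal (13 would
then show up as 0xd); please check that against the actual trace lines. The
function name count_retvals is only for illustration:

```shell
# Count how often each return value appears in a dumped trace file.
# $1: path to the trace dump, e.g. /tmp/trace.txt
count_retvals() {
    # Extract the "arg1=..." field from every event line, then
    # group identical values and list the most frequent first.
    grep -o 'arg1=[^ ]*' "$1" | sort | uniq -c | sort -rn
}
```

Calling count_retvals /tmp/trace.txt prints one line per distinct return
value, prefixed with its count.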

Fingers crossed my hypothesis is correct so we know where to 
start looking.

Thanks,
	Johannes
