Re: bisected: btrfs dedupe regression in v5.11-rc1: 3078d85c9a10 vfs: verify source area in vfs_dedupe_file_range_one()
From: Nikolay Borisov <hidden>
Date: 2021-12-14 11:11:27
On 14.12.21 г. 1:12, Zygo Blaxell wrote:
> On Mon, Dec 13, 2021 at 03:28:26PM +0200, Nikolay Borisov wrote:
>> On 10.12.21 г. 20:34, Zygo Blaxell wrote:
>>> I've been getting deadlocks in dedupe on btrfs since kernel 5.11, and
>>> some bees users have reported it as well. I bisected to this commit:
>>>
>>>     3078d85c9a10 vfs: verify source area in vfs_dedupe_file_range_one()
>>>
>>> These kernels work for at least 18 hours:
>>>
>>>     5.10.83 (months)
>>>     5.11.22 with 3078d85c9a10 reverted (36 hours)
>>>     btrfs misc-next 66dc4de326b0 with 3078d85c9a10 reverted
>>>
>>> These kernels lock up in 3 hours or less:
>>>
>>>     5.11.22
>>>     5.12.19
>>>     5.14.21
>>>     5.15.6
>>>     btrfs for-next 279373dee83e
>>>
>>> All of the failing kernels include this commit, none of the
>>> non-failing kernels include the commit.
>>>
>>> Kernel logs from the lockup:
>>>
>>>     [19647.696042][ T3721] sysrq: Show Blocked State
>>>     [19647.697024][ T3721] task:btrfs-transacti state:D stack: 0 pid: 6161 ppid: 2 flags:0x00004000
>>>     [19647.698203][ T3721] Call Trace:
>>>     [19647.698608][ T3721] __schedule+0x388/0xaf0
>>>     [19647.699125][ T3721] schedule+0x68/0xe0
>>>     [19647.699615][ T3721] btrfs_commit_transaction+0x97c/0xbf0
>>
>> Can you run this through symbolize script as I'd like to understand
>> where in transaction commit the sleep is happening.
>
> btrfs_commit_transaction+0x97c/0xbf0:
>
> btrfs_commit_transaction at fs/btrfs/transaction.c:2159 (discriminator 9)
>  2154
>  2155           ret = btrfs_run_delayed_items(trans);
>  2156           if (ret)
>  2157                   goto cleanup_transaction;
>  2158
> >2159<          wait_event(cur_trans->writer_wait,
>  2160                      extwriter_counter_read(cur_trans) == 0);
>  2161
>  2162           /* some pending stuffs might be added after the previous flush. */
>  2163           ret = btrfs_run_delayed_items(trans);
>  2164           if (ret)
So it seems there is an open transaction handle, so the commit can't continue and everything is stalled behind it.

Would you be able to run the attached python script on a host which is stuck? It requires debug symbols for the kernel to be installed, as well as drgn (https://github.com/osandov/drgn/), which is a scriptable debugger. The easiest way would be to follow the instructions at https://drgn.readthedocs.io/en/latest/installation.html and just get it via pip.

Once you have it installed, run it with:

    sudo drgn get-num-extwriters.py 310dd372-0fd1-4496-a232-0fb46ca4afd6

where 310dd372-0fd1-4496-a232-0fb46ca4afd6 is the fsid, as reported by 'blkid', of the wedged fs.

<snip>
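For anyone following along, here is a rough user-space sketch of the stall described above. It is only a toy model in plain Python, not the attached drgn script and not kernel code: the class ToyTransaction and its methods are made up for illustration, and the only thing taken from the trace is the idea that the commit sleeps until the external writer count drops to zero.

```python
import threading


class ToyTransaction:
    """Toy stand-in for a running transaction (hypothetical, not btrfs code)."""

    def __init__(self):
        # The condition variable plays the role of cur_trans->writer_wait.
        self._cond = threading.Condition()
        # Number of external writers (open handles) still attached.
        self.num_extwriters = 0

    def join(self):
        """Open a handle: one more external writer is attached."""
        with self._cond:
            self.num_extwriters += 1

    def end(self):
        """Close a handle and wake anyone waiting on the counter."""
        with self._cond:
            self.num_extwriters -= 1
            self._cond.notify_all()

    def commit(self, timeout=None):
        """Wait until no external writers remain.

        Returns True if the counter reached zero, False if the wait
        timed out. The kernel's wait_event() has no timeout; one is
        used here only so the demo cannot hang.
        """
        with self._cond:
            return self._cond.wait_for(
                lambda: self.num_extwriters == 0, timeout=timeout
            )


if __name__ == "__main__":
    trans = ToyTransaction()
    trans.join()                       # a handle is opened and never closed
    print(trans.commit(timeout=0.1))   # False: commit is stuck behind it
    trans.end()                        # the handle is finally released
    print(trans.commit(timeout=0.1))   # True: commit can proceed
```

In this model a handle that is never ended keeps commit() waiting forever when no timeout is given, which is the shape of the hang in the trace. A non-zero extwriter count on the wedged fs would point in the same direction.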
Attachments
- get-num-extwriters.py [text/x-python] 738 bytes