RE: [PATCH 1/2] btrfs: reada: limit max works count
From: Zhao Lei <hidden>
Date: 2016-01-22 12:26:19
Hi, Chris Mason
-----Original Message----- From: Chris Mason [mailto:clm@fb.com] Sent: Thursday, January 21, 2016 10:15 PM To: Zhao Lei <redacted> Cc: linux-btrfs@vger.kernel.org Subject: Re: [PATCH 1/2] btrfs: reada: limit max works count On Thu, Jan 21, 2016 at 06:06:21PM +0800, Zhao Lei wrote:quoted
Hi, Chris Masonquoted
-----Original Message----- From: Chris Mason [mailto:clm@fb.com] Sent: Thursday, January 21, 2016 1:48 AM To: Zhao Lei <redacted>; linux-btrfs@vger.kernel.org Subject: Re: [PATCH 1/2] btrfs: reada: limit max works count On Wed, Jan 20, 2016 at 10:16:27AM -0500, Chris Mason wrote:quoted
On Tue, Jan 12, 2016 at 03:46:26PM +0800, Zhao Lei wrote:quoted
reada create 2 works for each level of tree in recursion. In case of a tree having many levels, the number of created works is 2^level_of_tree. Actually we don't need so many works in parallel, this patch limit max works to BTRFS_MAX_MIRRORS * 2.Hi, I don't think you end up calling atomic_dec() for every time that reada_start_machine() is called. Also, I'd rather not have a global static variable to limit the parallel workers, when we have more than one FS mounted it'll end up limiting things too much. With this patch applied, I'm seeing deadlocks during btrfs/066. You have to run the scrub tests as well, basically we're just getting fsstress run alongside scrub. I'll run a few more times with it reverted to make sure, but I think it's the root cause.I spoke too soon, it ended up deadlocking a few tests later.In logic, even if the calculation of atomic_dec() in this patch having bug, in worst condition, reada will works in single-thread mode, and will not introduce deadlock. And by looking the backtrace in this mail, maybe it is caused by reada_control->elems in someplace of this patchset. I recheck xfstests/066 in both vm and physical machine, on top of my pull-request git today, with btrfs-progs 4.4 for many times, but had nottriggered the bug. Just running 066 alone doesn't trigger it for me. I have to run everything from 00->066. My setup is 5 drives. I use a script to carve them up into logical volumes, 5 for the test device and 5 for the scratch pool. I think it should reproduce with a single drive, if you still can't trigger I'll confirm that.quoted
Could you tell me your test environment(TEST_DEV size, mount option), and odds of fails in btrfs/066?100% odds of failing, one time it made it up to btrfs/072. I think more important than the drive setup is that I have all the debugging on. CONFIG_DEBUG_PAGEALLOC, spinlock debugging, mutex debugging and lock dep enabled.
Thanks for your answer. But unfortunately I hadn't reproduce the dead_lock in above way today... Now I queued loop of above reproduce script in more nodes, and hopes it can happen in this weekend. And by reviewing code, I found a problem which can introduce similar bad result in logic, and made a patch for it. [PATCH] [RFC] btrfs: reada: avoid undone reada extents in btrfs_reada_wait Because it is only a problem in logic, but rarely happened, I only confirmed no-problem after patch applied. Sorry for increased your works, could you apply this patch and test is it works? Thanks Zhaolei
-chris