Re: 3.4.4-rt13: btrfs + xfstests 006 = BOOM.. and a bonus rt_mutex deadlock report for absolutely free!
From: Chris Mason <hidden>
Date: 2012-07-16 15:43:04
Also in: linux-fsdevel, lkml
On Mon, Jul 16, 2012 at 04:55:44AM -0600, Mike Galbraith wrote:
> On Sat, 2012-07-14 at 12:14 +0200, Mike Galbraith wrote:
> > On Fri, 2012-07-13 at 08:50 -0400, Chris Mason wrote:
> > > On Wed, Jul 11, 2012 at 11:47:40PM -0600, Mike Galbraith wrote:
> > > > Greetings,
> > >
> > > [ deadlocks with btrfs and the recent RT kernels ]
> > >
> > > I talked with Thomas about this and I think the problem is the
> > > single-reader nature of the RW rwlocks. The lockdep report below
> > > mentions that btrfs is calling:
> > >
> > > > [ 692.963099] [<ffffffff811fabd2>] btrfs_clear_path_blocking+0x32/0x70
> > >
> > > In this case, the task has a number of blocking read locks on the
> > > btrfs buffers, and we're trying to turn them back into spinning
> > > read locks. Even though btrfs is taking the read rwlock, it doesn't
> > > think of this as a new lock operation because we were blocking out
> > > new writers.
> > >
> > > If the second task has taken the spinning read lock, it is going to
> > > prevent that clear_path_blocking operation from progressing, even
> > > though it would have worked on a non-RT kernel.
> > >
> > > The solution should be to make the blocking read locks in btrfs
> > > honor the single-reader semantics. This means not allowing more
> > > than one blocking reader and not allowing a spinning reader when
> > > there is a blocking reader. Strictly speaking btrfs shouldn't need
> > > recursive readers on a single lock, so I wouldn't worry about that
> > > part.
> > >
> > > There is also a chunk of code in btrfs_clear_path_blocking that
> > > makes sure to strictly honor top down locking order during the
> > > conversion. It only does this when lockdep is enabled because in
> > > non-RT kernels we don't need to worry about it. For RT we'll want
> > > to enable that as well.
> > >
> > > I'll give this a shot later today.
> >
> > I took a poke at it. Did I do something similar to what you had in
> > mind, or just hide behind performance stealing paranoid trylock
> > loops? Box survived 1000 x xfstests 006 and dbench [-s] massive
> > right off the bat, so it gets posted despite skepticism.
>
> Seems btrfs isn't entirely convinced either.
>
> [ 2292.336229] use_block_rsv: 1810 callbacks suppressed
> [ 2292.336231] ------------[ cut here ]------------
> [ 2292.336255] WARNING: at fs/btrfs/extent-tree.c:6344 use_block_rsv+0x17d/0x190 [btrfs]()
> [ 2292.336257] Hardware name: System x3550 M3 -[7944K3G]-
> [ 2292.336259] btrfs: block rsv returned -28
This is unrelated. You got far enough into the benchmark to hit an
ENOSPC warning. This can be ignored (I just deleted it when we used 3.0
for oracle).

re: dbench performance. dbench tends to penalize fairness. I can
imagine RT making it slower in general. It also triggers lots of lock
contention in btrfs because the dataset is fairly small and the trees
don't fan out a lot.

-chris
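
For readers following the quoted discussion, the single-reader rule proposed there (at most one blocking reader, and no spinning reader while a blocking reader exists) can be modelled as a tiny state machine. This is only an illustrative sketch in Python, not btrfs code and not the patch posted in this thread. The class ToyReadLock and all of its method names are hypothetical, and a return value of False stands for "the caller would have to wait".

```python
class ToyReadLock:
    """Toy state model of one tree-block read lock (hypothetical, not btrfs)."""

    def __init__(self):
        self.spinning_readers = 0  # readers holding the underlying rwlock
        self.blocking_readers = 0  # readers that may sleep while holding it

    def try_read_lock(self):
        """Try to take a spinning read lock; False means 'would have to wait'."""
        # Single-reader rule: refuse if anyone already holds the lock,
        # whether as a spinning or as a blocking reader.
        if self.blocking_readers or self.spinning_readers:
            return False
        self.spinning_readers = 1
        return True

    def set_blocking(self):
        """Convert the held spinning read lock into a blocking one."""
        assert self.spinning_readers == 1 and self.blocking_readers == 0
        self.spinning_readers = 0
        self.blocking_readers = 1

    def clear_blocking(self):
        """Convert the blocking read lock back into a spinning one."""
        assert self.blocking_readers == 1
        if self.spinning_readers:
            # Never reached while try_read_lock() enforces the rule above;
            # it marks where the conversion would stall if a second task
            # had been allowed in as a spinning reader.
            return False
        self.blocking_readers = 0
        self.spinning_readers = 1
        return True

    def read_unlock(self):
        """Drop a spinning read lock."""
        assert self.spinning_readers == 1
        self.spinning_readers = 0


def demo():
    """Task A locks, goes blocking, and converts back; task B must wait."""
    lock = ToyReadLock()
    results = []
    results.append(lock.try_read_lock())   # A takes the spinning lock
    lock.set_blocking()                    # A becomes a blocking reader
    results.append(lock.try_read_lock())   # B is refused while A blocks
    results.append(lock.clear_blocking())  # A converts back without stalling
    lock.read_unlock()                     # A drops the lock
    results.append(lock.try_read_lock())   # B can now get in
    return results


if __name__ == "__main__":
    print(demo())  # [True, False, True, True]
```

The point of the model is the second step: because the second task is turned away while the first holds a blocking read lock, the later conversion back to a spinning lock has nothing to wait on, which is the progress guarantee the quoted explanation is after.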