Thread: 6 messages, 2 authors, 2021-02-05

Re: [PATCH] fs/btrfs: Fix raid6 qstripe kmap'ing

From: David Sterba <hidden>
Date: 2021-02-04 15:29:35
Also in: linux-fsdevel

On Wed, Feb 03, 2021 at 04:56:48PM +0100, David Sterba wrote:
> On Wed, Jan 27, 2021 at 10:15:03PM -0800, ira.weiny@intel.com wrote:
> > From: Ira Weiny <redacted>
> >
> > When a qstripe is required an extra page is allocated and mapped.  There
> > were 3 problems.
> >
> > 1) There is no reason to map the qstripe page more than 1 time if the
> >    number of bits set in rbio->dbitmap is greater than one.
> > 2) There is no reason to map the parity page and unmap it each time
> >    through the loop.
> > 3) There is no corresponding call of kunmap() for the qstripe page.
> >
> > The page memory can continue to be reused with a single mapping on each
> > iteration by raid6_call.gen_syndrome() without remapping.  So map the
> > page for the duration of the loop.
> >
> > Similarly, improve the algorithm by mapping the parity page just 1 time.
> >
> > Fixes: 5a6ac9eacb49 ("Btrfs, raid56: support parity scrub on raid56")
> > To: Chris Mason <clm@fb.com>
> > To: Josef Bacik <josef@toxicpanda.com>
> > To: David Sterba <dsterba@suse.com>
> > Cc: Miao Xie <redacted>
> > Signed-off-by: Ira Weiny <redacted>
> >
> > ---
> > This was found while replacing kmap() with kmap_local_page().  After
> > this patch unwinding all the mappings becomes pretty straight forward.
> >
> > I'm not exactly sure I've worded this commit message intelligently.
> > Please forgive me if there is a better way to word it.
>
> Changelog is good, thanks. I've added stable tags as the missing unmap
> is a potential problem.

There are lots of tests failing, with stack traces like the one below. I
haven't seen anything obvious in the patch, so it needs a closer look, and
for the time being I can't add the patch to for-next.

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 2 PID: 17173 Comm: kworker/u8:5 Not tainted 5.11.0-rc6-default+ #1422
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-rmw btrfs_work_helper [btrfs]
 RIP: 0010:raid6_avx22_gen_syndrome+0x103/0x140 [raid6_pq]
 RSP: 0018:ffffa090042cfcf8 EFLAGS: 00010246
 RAX: ffff9e98e1848e80 RBX: ffff9e98d5849000 RCX: 0000000000000020
 RDX: ffff9e98e32be000 RSI: 0000000000000000 RDI: ffff9e98e1848e80
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
 R10: ffff9e98e1848e90 R11: ffff9e98e1848e98 R12: 0000000000001000
 R13: ffff9e98e1848e88 R14: 0000000000000005 R15: 0000000000000002
 FS:  0000000000000000(0000) GS:ffff9e993da00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000000023143003 CR4: 0000000000170ea0
 Call Trace:
  finish_parity_scrub+0x47b/0x7a0 [btrfs]
  raid56_parity_scrub_stripe+0x24e/0x260 [btrfs]
  btrfs_work_helper+0xd5/0x1d0 [btrfs]
  process_one_work+0x262/0x5f0
  worker_thread+0x4e/0x300
  ? process_one_work+0x5f0/0x5f0
  kthread+0x151/0x170
  ? __kthread_bind_mask+0x60/0x60
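
For reference, the mapping pattern the quoted changelog argues for (map the
parity and qstripe pages once for the whole loop, then unmap both) can be
sketched in userspace C. This is only an illustration and not the btrfs code:
fake_page, fake_kmap(), fake_kunmap(), fake_parity_work() and both scrub_*
functions are hypothetical stand-ins that just count mappings, and the loop
over dbitmap is a simplified assumption about how the set bits are walked.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct page: a buffer plus a mapping count. */
struct fake_page {
    unsigned char mem[64];
    int mapped;              /* outstanding mappings; should end at 0 */
};

static int map_calls;        /* total fake_kmap() calls, for comparison */

/* Stand-in for kmap(): only records that a mapping was taken. */
static void *fake_kmap(struct fake_page *pg)
{
    map_calls++;
    pg->mapped++;
    return pg->mem;
}

/* Stand-in for kunmap(): drops one outstanding mapping. */
static void fake_kunmap(struct fake_page *pg)
{
    assert(pg->mapped > 0);
    pg->mapped--;
}

/* Placeholder for the per-sector parity computation. */
static void fake_parity_work(void *p, void *q, int sector)
{
    memset(p, sector, 1);
    memset(q, sector, 1);
}

/* Pattern the changelog objects to: map inside the loop, q never unmapped. */
static void scrub_map_per_iteration(struct fake_page *p, struct fake_page *q,
                                    unsigned long dbitmap, int nbits)
{
    for (int i = 0; i < nbits; i++) {
        if (!(dbitmap & (1UL << i)))
            continue;
        void *pp = fake_kmap(p);     /* problem 2: mapped every iteration */
        void *qp = fake_kmap(q);     /* problem 1: remapped per set bit */
        fake_parity_work(pp, qp, i);
        fake_kunmap(p);              /* problem 3: no unmap for q */
    }
}

/* Pattern the changelog proposes: one mapping each, held across the loop. */
static void scrub_map_once(struct fake_page *p, struct fake_page *q,
                           unsigned long dbitmap, int nbits)
{
    void *pp = fake_kmap(p);
    void *qp = fake_kmap(q);

    for (int i = 0; i < nbits; i++) {
        if (dbitmap & (1UL << i))
            fake_parity_work(pp, qp, i);
    }

    /* Unmap in reverse order of mapping. */
    fake_kunmap(q);
    fake_kunmap(p);
}
```

With two bits set in dbitmap, the per-iteration variant takes four mappings
and leaves two outstanding on the q page, while the map-once variant takes two
and leaves none. The sketch says nothing about the cause of the oops above.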