Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
From: Yu Kuai <hidden>
Date: 2023-08-07 02:46:13
Subsystem: software raid (multiple disks) support, the rest
Maintainers: Song Liu, Yu Kuai, Linus Torvalds
Hi,

On 2023/08/07 10:15, Yu Kuai wrote:
Hi,

On 2023/08/07 10:09, Corey Hickey wrote:
On 2023-08-06 18:02, Yu Kuai wrote:
Here are the errors reported by the kernel:
--------------------------------------------------------------------
[ 2566.222104] BUG: kernel NULL pointer dereference, address: 0000000000000157
[ 2566.222111] #PF: supervisor read access in kernel mode
[ 2566.222114] #PF: error_code(0x0000) - not-present page
[ 2566.222117] PGD 0 P4D 0
[ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
[ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
[ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0

Can you provide the addr2line result? This will be helpful to locate the
problem.

I have not done this before; I struggled a bit until I found this:
https://lwn.net/Articles/592724/

These are run within the kernel source tree, which I have not modified
since the original compilation.

$ scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
[ 2566.222171] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 2566.222176] ? page_fault_oops (arch/x86/mm/fault.c:707)
[ 2566.222180] ? update_load_avg (kernel/sched/fair.c:3920 kernel/sched/fair.c:4255)
[ 2566.222185] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:695 arch/x86/mm/fault.c:1494 arch/x86/mm/fault.c:1542)
[ 2566.222190] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
[ 2566.222196] ? submit_bio_noacct (block/blk-throttle.h:198 block/blk-throttle.h:210 block/blk-core.c:800)
[ 2566.222201] handle_active_stripes.isra.0 (drivers/md/raid5.c:6709 (discriminator 1)) raid456
I'm not sure yet where this io comes from; however, based on your
test, I think it comes from:

raid5d
 handle_active_stripes
  r5l_flush_stripe_to_raid
   submit_bio
And I found a problem after a quick look here:

t1: submit flush io
raid5d
 handle_active_stripes
  r5l_flush_stripe_to_raid
   bio_init
   submit_bio
   // io1

t2: io1 is done
r5l_log_flush_endio
 list_splice_tail_init
 // new flush io can be dispatched

t3: submit new flush io
...
r5l_flush_stripe_to_raid
 bio_init
			// meanwhile, still in r5l_log_flush_endio (t2):
			bio_uninit
			// clear bio->bi_blkg
 submit_bio
 // null-ptr-deref
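
To make the ordering concrete, here is a minimal user-space sketch that replays t1..t3 in exactly that order. Everything in it is a hypothetical stand-in (model_bio, model_log, the flush_in_flight flag); none of it is kernel code, and the "list is empty" gate is reduced to a boolean as an assumption based on the "new flush io can be dispatched" comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel's types. */
struct model_bio {
	void *bi_blkg;			/* models bio->bi_blkg */
};

struct model_log {
	struct model_bio flush_bio;	/* one embedded bio, reused for every flush */
	bool flush_in_flight;		/* assumption: models "flushing_ios is non-empty" */
};

static int model_blkg;			/* any object with a non-NULL address */

static void model_bio_init(struct model_bio *bio)
{
	bio->bi_blkg = &model_blkg;
}

static void model_bio_uninit(struct model_bio *bio)
{
	bio->bi_blkg = NULL;
}

/* Returns true when the submit step would read through a NULL bi_blkg. */
static bool model_submit_hits_null(const struct model_bio *bio)
{
	return bio->bi_blkg == NULL;
}

/* Replay t1..t3 in the order of the timeline; true means "would oops". */
bool replay_timeline(void)
{
	struct model_log log = { { NULL }, false };

	/* t1: submit flush io (io1) */
	model_bio_init(&log.flush_bio);
	log.flush_in_flight = true;

	/* t2: io1 is done; the list is emptied, so a new flush may start */
	log.flush_in_flight = false;

	/* t3: a new flush re-initializes the same embedded bio ... */
	assert(!log.flush_in_flight);
	model_bio_init(&log.flush_bio);
	log.flush_in_flight = true;

	/* ... and only now does the t2 completion path reach its uninit */
	model_bio_uninit(&log.flush_bio);

	/* t3 continues with submit on a bio whose bi_blkg was just cleared */
	return model_submit_hits_null(&log.flush_bio);
}
```

In this model replay_timeline() returns true, i.e. the second submit sees a cleared bi_blkg. It only shows that the ordering is harmful; it does not show that this is the path hit in the report.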
This is definitely a problem; however, I'm not sure if this is your case.
Can you test the following patch?

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 51a68fbc241c..a85ea19fcf14 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1266,9 +1266,8 @@ static void r5l_log_flush_endio(struct bio *bio)
 	list_for_each_entry(io, &log->flushing_ios, log_sibling)
 		r5l_io_run_stripes(io);
 	list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
-	spin_unlock_irqrestore(&log->io_list_lock, flags);
-
 	bio_uninit(bio);
+	spin_unlock_irqrestore(&log->io_list_lock, flags);
 }
 
 /*

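
As a rough check of why the ordering matters, the sketch below enumerates every interleaving of a modeled completion path and a modeled submit path, and counts how often the submit sees a cleared blkg. It assumes the patch's effect is that bio_uninit runs before io_list_lock is released (which is what the hunk's 9-to-8 line counts suggest), and that the submit path checks the list under the same lock. All names are hypothetical and the model is far simpler than the real driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Steps of the two modeled code paths (hypothetical names, not kernel API). */
enum step { LOCK, SPLICE, UNINIT, UNLOCK, CHECK, INIT, SUBMIT, DONE };

/* Completion path before the patch: uninit runs after the unlock. */
static const enum step endio_old[] = { LOCK, SPLICE, UNLOCK, UNINIT, DONE };
/* Completion path with the patch: uninit runs while the lock is held. */
static const enum step endio_new[] = { LOCK, SPLICE, UNINIT, UNLOCK, DONE };
/* Submit path: check the list under the lock, then init and submit. */
static const enum step submitter[] = { LOCK, CHECK, UNLOCK, INIT, SUBMIT, DONE };
enum { SUBMITTER_DONE = 5 };	/* index of DONE in submitter[] */

struct state {
	int pc_e, pc_s;		/* next step index of each path */
	bool locked;		/* models io_list_lock being held */
	bool idle;		/* assumption: models "flushing_ios is empty" */
	bool blkg;		/* models "bio->bi_blkg is non-NULL" */
};

/* Try every legal interleaving; return how many submit with a NULL blkg. */
static int explore(const enum step *endio, struct state st)
{
	enum step e, s;
	int bad = 0;

	assert(st.pc_e <= 4 && st.pc_s <= SUBMITTER_DONE);
	e = endio[st.pc_e];
	s = submitter[st.pc_s];

	/* Let the completion path take one step, unless it waits for the lock. */
	if (e != DONE && !(e == LOCK && st.locked)) {
		struct state n = st;

		n.pc_e++;
		if (e == LOCK)
			n.locked = true;
		else if (e == UNLOCK)
			n.locked = false;
		else if (e == SPLICE)
			n.idle = true;		/* a new flush may now start */
		else if (e == UNINIT)
			n.blkg = false;
		bad += explore(endio, n);
	}

	/* Let the submit path take one step, unless it waits for the lock. */
	if (s != DONE && !(s == LOCK && st.locked)) {
		struct state n = st;

		n.pc_s++;
		if (s == LOCK) {
			n.locked = true;
		} else if (s == UNLOCK) {
			n.locked = false;
		} else if (s == CHECK && n.idle) {
			n.idle = false;		/* claim the flush */
		} else if (s == CHECK) {
			n.locked = false;	/* flush still running: bail out */
			n.pc_s = SUBMITTER_DONE;
		} else if (s == INIT) {
			n.blkg = true;
		} else if (s == SUBMIT && !n.blkg) {
			bad++;
		}
		bad += explore(endio, n);
	}
	return bad;
}

int count_null_submits(bool patched)
{
	/* Start where t1 ends: io1 is in flight, so the bio is initialized. */
	struct state st = { 0, 0, false, false, true };

	return explore(patched ? endio_new : endio_old, st);
}
```

In this model, count_null_submits(false) finds at least one bad interleaving, while count_null_submits(true) finds none: the submit path cannot see an empty list until the completion path has finished its uninit. This covers the modeled steps only and says nothing about other users of the bio.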
Thanks, Kuai
[ 2566.222220] raid5d (drivers/md/raid5.c:6821) raid456
[ 2566.222234] ? __schedule (kernel/sched/core.c:6677)
[ 2566.222240] ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:186 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4))
[ 2566.222245] ? preempt_count_add (./include/linux/ftrace.h:976 kernel/sched/core.c:5793 kernel/sched/core.c:5790 kernel/sched/core.c:5818)
[ 2566.222248] ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:186 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4))
[ 2566.222254] ? __pfx_md_thread (drivers/md/md.c:7862) md_mod
[ 2566.222273] md_thread (drivers/md/md.c:7898) md_mod
[ 2566.222293] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:418)
[ 2566.222299] kthread (kernel/kthread.c:379)
[ 2566.222304] ? __pfx_kthread (kernel/kthread.c:332)
[ 2566.222309] ret_from_fork (arch/x86/entry/entry_64.S:314)

$ scripts/decode_stacktrace.sh vmlinux < /tmp/trace2
[ 2566.436288] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436292] ? __warn (kernel/panic.c:673)
[ 2566.436298] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436301] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 2566.436308] ? handle_bug (arch/x86/kernel/traps.c:303)
[ 2566.436312] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[ 2566.436316] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 2566.436321] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436325] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436329] make_task_dead (kernel/exit.c:972)
[ 2566.436333] rewind_stack_and_make_dead (??:?)

Is that what you are looking for?

Yes, and can you tell me which commit you are testing?

Thanks,
Kuai

Thanks,
Corey