[bug report] block/005 hangs with NVMe device and linux-block/for-next
From: Shinichiro Kawasaki <hidden>
Date: 2021-11-01 08:34:31
I tried the latest linux-block/for-next branch tip (git hash b43fadb6631f and
observed a process hang during blktests block/005 run on a NVMe device.
Kernel message reported "INFO: task check:1224 blocked for more than 122
seconds." with call trace [1]. So far, the hang is 100% reproducible with my
system. This hang is not observed with HDDs or null_blk devices.
I bisected and found the commit 4f5022453acd ("nvme: wire up completion batching
for the IRQ path") triggers the hang. When I revert this commit from the
for-next branch tip, the hang disappears. The block/005 test case does IO
scheduler switch during IO, and the completion path change by the commit looks
affecting the scheduler switch. Comments for solution will be appreciated.
[1]
[ 121.677521] run blktests block/005 at 2021-11-01 16:24:21
[ 371.819391] INFO: task check:1224 blocked for more than 122 seconds.
[ 371.826660] Not tainted 5.15.0-rc6+ #41
[ 371.831786] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 371.840349] task:check state:D stack: 0 pid: 1224 ppid: 1223 flags:0x00000000
[ 371.849471] Call Trace:
[ 371.852680] __schedule+0x9e2/0x2240
[ 371.857052] ? io_schedule_timeout+0x190/0x190
[ 371.862330] ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 371.868070] schedule+0xdd/0x280
[ 371.872062] blk_mq_freeze_queue_wait+0xc0/0xf0
[ 371.877351] ? blk_mq_queue_inflight+0x70/0x70
[ 371.882554] ? blk_mq_run_hw_queues+0x13b/0x410
[ 371.887856] ? finish_wait+0x270/0x270
[ 371.892363] elevator_switch+0x4a/0xa0
[ 371.896857] elv_iosched_store+0x16e/0x3c0
[ 371.901698] ? elevator_init_mq+0x3f0/0x3f0
[ 371.906643] ? lock_is_held_type+0xe0/0x110
[ 371.911556] ? lock_is_held_type+0xe0/0x110
[ 371.916473] queue_attr_store+0x8b/0xd0
[ 371.921050] ? sysfs_file_ops+0x170/0x170
[ 371.925789] kernfs_fop_write_iter+0x2c7/0x460
[ 371.930965] new_sync_write+0x359/0x5e0
[ 371.935533] ? new_sync_read+0x5d0/0x5d0
[ 371.940177] ? ksys_write+0xe9/0x1b0
[ 371.944483] ? lock_release+0x690/0x690
[ 371.949046] ? __cond_resched+0x15/0x30
[ 371.953604] ? inode_security+0x56/0xf0
[ 371.958171] ? lock_is_held_type+0xe0/0x110
[ 371.963075] vfs_write+0x5e4/0x8e0
[ 371.967196] ksys_write+0xe9/0x1b0
[ 371.971321] ? __ia32_sys_read+0xa0/0xa0
[ 371.975962] ? syscall_enter_from_user_mode+0x21/0x70
[ 371.981733] do_syscall_64+0x3b/0x90
[ 371.986028] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 371.991799] RIP: 0033:0x7f24adda7387
[ 371.996085] RSP: 002b:00007ffd9c80f2e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 372.004366] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f24adda7387
[ 372.012213] RDX: 0000000000000005 RSI: 0000562156826120 RDI: 0000000000000001
[ 372.020060] RBP: 0000562156826120 R08: 000000000000000a R09: 00007f24ade3d4e0
[ 372.027901] R10: 00007f24ade3d3e0 R11: 0000000000000246 R12: 0000000000000005
[ 372.035746] R13: 00007f24ade7a520 R14: 0000000000000005 R15: 00007f24ade7a700
[ 372.043629]
Showing all locks held in the system:
[ 372.051235] 1 lock held by khungtaskd/62:
[ 372.055960] #0: ffffffffa64483c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x269
[ 372.065649] 1 lock held by systemd-journal/584:
[ 372.070902] 4 locks held by check/1224:
[ 372.075452] #0: ffff888118fc4460 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xe9/0x1b0
[ 372.084177] #1: ffff888120717088 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x216/0x460
[ 372.093771] #2: ffff8881480a0a00 (kn->active#131){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x239/0x460
[ 372.103707] #3: ffff888148d66858 (&q->sysfs_lock){+.+.}-{3:3}, at: queue_attr_store+0x5d/0xd0
[ 372.115258] =============================================
--
Best Regards,
Shin'ichiro Kawasaki