Thread (11 messages) 11 messages, 4 authors, 2025-12-30

Re: [syzbot] [net?] INFO: task hung in new_device_store (5)

From: Hillf Danton <hidden>
Date: 2024-09-27 11:07:00
Also in: lkml

On Thu, 26 Sep 2024 22:14:14 +0200 Eric Dumazet [off-list ref]
quoted hunk ↗ jump to hunk
On Thu, Sep 26, 2024 at 7:58 PM syzbot wrote:
quoted
Hello,

syzbot found the following issue on:

HEAD commit:    97d8894b6f4c Merge tag 'riscv-for-linus-6.12-mw1' of git:/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12416a27980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=bc30a30374b0753
dashboard link: https://syzkaller.appspot.com/bug?extid=05f9cecd28e356241aba
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/bd119f4fdc08/disk-97d8894b.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/4d0bfed66f93/vmlinux-97d8894b.xz
kernel image: https://storage.googleapis.com/syzbot-assets/0f9223ac9bfb/bzImage-97d8894b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+05f9cecd28e356241aba@syzkaller.appspotmail.com

INFO: task syz-executor:9916 blocked for more than 143 seconds.
      Not tainted 6.11.0-syzkaller-10045-g97d8894b6f4c #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor    state:D stack:21104 pid:9916  tgid:9916  ppid:1      flags:0x00000004
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5315 [inline]
 __schedule+0x1895/0x4b30 kernel/sched/core.c:6674
 __schedule_loop kernel/sched/core.c:6751 [inline]
 schedule+0x14b/0x320 kernel/sched/core.c:6766
 schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6823
 __mutex_lock_common kernel/locking/mutex.c:684 [inline]
 __mutex_lock+0x6a7/0xd70 kernel/locking/mutex.c:752
 new_device_store+0x1b4/0x890 :166
 kernfs_fop_write_iter+0x3a2/0x500 fs/kernfs/file.c:334
 new_sync_write fs/read_write.c:590 [inline]
 vfs_write+0xa6f/0xc90 fs/read_write.c:683
 ksys_write+0x183/0x2b0 fs/read_write.c:736
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8310d7c9df
RSP: 002b:00007ffe830a52e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f8310d7c9df
RDX: 0000000000000003 RSI: 00007ffe830a5330 RDI: 0000000000000005
RBP: 00007f8310df1c39 R08: 0000000000000000 R09: 00007ffe830a5137
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003
R13: 00007ffe830a5330 R14: 00007f8311a64620 R15: 0000000000000003
 </TASK>
typical sysfs deadlock ?
diff --git a/drivers/net/netdevsim/bus.c b/drivers/net/netdevsim/bus.c
index 64c0cdd31bf85468ce4fa2b2af5c8aff4cfba897..3bf0ce52d71653fd9b8c752d52d0b5b7e19042d8
100644
--- a/drivers/net/netdevsim/bus.c
+++ b/drivers/net/netdevsim/bus.c
@@ -163,7 +163,9 @@ new_device_store(const struct bus_type *bus, const
char *buf, size_t count)
                return -EINVAL;
        }

-       mutex_lock(&nsim_bus_dev_list_lock);
+       if (!mutex_trylock(&nsim_bus_dev_list_lock))
+               return restart_syscall();
+
        /* Prevent to use resource before initialization. */
        if (!smp_load_acquire(&nsim_bus_enable)) {
                err = -EBUSY;

quoted
Showing all locks held in the system:
...
quoted
4 locks held by syz-executor/9916:
 #0: ffff88807ca86420 (sb_writers#8){.+.+}-{0:0}, at: file_start_write include/linux/fs.h:2930 [inline]
 #0: ffff88807ca86420 (sb_writers#8){.+.+}-{0:0}, at: vfs_write+0x224/0xc90 fs/read_write.c:679
 #1: ffff88802e71e488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x1ea/0x500 fs/kernfs/file.c:325
 #2: ffff888144ff5968 (kn->active#50){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x20e/0x500 fs/kernfs/file.c:326
 #3: ffffffff8f56d3e8 (nsim_bus_dev_list_lock){+.+.}-{3:3}, at: new_device_store+0x1b4/0x890 drivers/net/netdevsim/bus.c:166
syz-executor/9916 is lock waiter, and
quoted
7 locks held by syz-executor/9976:
 #0: ffff88807ca86420 (sb_writers#8){.+.+}-{0:0}, at: file_start_write include/linux/fs.h:2930 [inline]
 #0: ffff88807ca86420 (sb_writers#8){.+.+}-{0:0}, at: vfs_write+0x224/0xc90 fs/read_write.c:679
 #1: ffff88807abc2888 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x1ea/0x500 fs/kernfs/file.c:325
 #2: ffff888144ff5a58 (kn->active#49){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x20e/0x500 fs/kernfs/file.c:326
 #3: ffffffff8f56d3e8 (nsim_bus_dev_list_lock){+.+.}-{3:3}, at: del_device_store+0xfc/0x480 drivers/net/netdevsim/bus.c:216
 #4: ffff888060f5a0e8 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:1014 [inline]
 #4: ffff888060f5a0e8 (&dev->mutex){....}-{3:3}, at: __device_driver_lock drivers/base/dd.c:1095 [inline]
 #4: ffff888060f5a0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0xce/0x7c0 drivers/base/dd.c:1293
 #5: ffff888060f5b250 (&devlink->lock_key#40){+.+.}-{3:3}, at: nsim_drv_remove+0x50/0x160 drivers/net/netdevsim/dev.c:1672
 #6: ffffffff8fccdc48 (rtnl_mutex){+.+.}-{3:3}, at: nsim_destroy+0x71/0x5c0 drivers/net/netdevsim/netdev.c:773
syz-executor/9976 is lock owner. Given both waiter and owner printed,
the proposed trylock looks like the typical paperover at least from a
hoofed skull because of no real deadlock detected.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help