Re: INFO: rcu detected stall in sys_kill
From: Daniel Axtens <hidden>
Date: 2019-12-17 13:39:02
Also in:
lkml
Daniel Axtens [off-list ref] writes:
Hi Casey,quoted
There haven't been Smack changes recently, so this is going to have been introduced elsewhere. I'm perfectly willing to accept that Smack is doing something horribly wrong WRT rcu, and that it needs repair, but its going to be tough for me to track down. I hope someone else is looking into this, as my chances of finding the problem are pretty slim.Yeah, I'm having a look, it's probably related to my kasan-vmalloc stuff. It's currently in a bit of flux as syzkaller finds a bunch of other bugs with it, once that stablises a bit I'll come back to Smack.
I have had a brief and wildly unsuccessful look at this. I'm happy to come back to it and go over it with a finer toothed comb, but it will almost certainly have to wait until next year. I don't think it's related to RCU, we also have a plain lockup: https://syzkaller.appspot.com/bug?id=be03729d17bb3b2df1754a7486a8f8628f6ff1ec Dmitry, I've been really struggling to repro this locally, even with your config. Is there an easy way to see the kernel command line you booted with and anything else that makes this image special? I have zero experience with smack so this is a steep learning curve. Regards, Daniel
Regards, Danielquoted
quoted
quoted
I see 2 common this across all stalls: 1. They all happen on the instance that uses smack (which is now effectively dead), see smack instance here: https://syzkaller.appspot.com/upstream 2. They all contain this frame in the stack trace: free_thread_stack+0x168/0x590 kernel/fork.c:280 The last commit that touches this file is "fork: support VMAP_STACK with KASAN_VMALLOC". That may be very likely the root cause. +DanielI've stopped smack syzbot instance b/c it produces infinite stream of assorted crashes due to this. Please ping syzkaller@googlegroups.com when this is fixed, I will re-enable the instance.quoted
quoted
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: (detected by 1, t=10502 jiffies, g=6629, q=331) rcu: All QSes seen, last rcu_preempt kthread activity 10503 (4294953794-4294943291), jiffies_till_next_fqs=1, root ->qsmask 0x0 syz-executor.0 R running task 24648 8293 8292 0x0000400a Call Trace: <IRQ> sched_show_task+0x40f/0x560 kernel/sched/core.c:5954 print_other_cpu_stall kernel/rcu/tree_stall.h:410 [inline] check_cpu_stall kernel/rcu/tree_stall.h:538 [inline] rcu_pending kernel/rcu/tree.c:2827 [inline] rcu_sched_clock_irq+0x1861/0x1ad0 kernel/rcu/tree.c:2271 update_process_times+0x12d/0x180 kernel/time/timer.c:1726 tick_sched_handle kernel/time/tick-sched.c:167 [inline] tick_sched_timer+0x263/0x420 kernel/time/tick-sched.c:1310 __run_hrtimer kernel/time/hrtimer.c:1514 [inline] __hrtimer_run_queues+0x403/0x840 kernel/time/hrtimer.c:1576 hrtimer_interrupt+0x38c/0xda0 kernel/time/hrtimer.c:1638 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1110 [inline] smp_apic_timer_interrupt+0x109/0x280 arch/x86/kernel/apic/apic.c:1135 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829 </IRQ> RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline] RIP: 0010:check_kcov_mode kernel/kcov.c:70 [inline] RIP: 0010:__sanitizer_cov_trace_pc+0x1c/0x50 kernel/kcov.c:102 Code: cc 07 48 89 de e8 64 02 3b 00 5b 5d c3 cc 48 8b 04 24 65 48 8b 0c 25 c0 1d 02 00 65 8b 15 b8 81 8b 7e f7 c2 00 01 1f 00 75 2c <8b> 91 80 13 00 00 83 fa 02 75 21 48 8b 91 88 13 00 00 48 8b 32 48 RSP: 0018:ffffc900021c7c28 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffff81487433 RBX: 0000000000000000 RCX: ffff88809428a100 RDX: 0000000000000001 RSI: 00000000fffffffc RDI: ffffea0002479240 RBP: ffffc900021c7c50 R08: dffffc0000000000 R09: fffffbfff1287025 R10: fffffbfff1287025 R11: 0000000000000000 R12: dffffc0000000000 R13: dffffc0000000000 R14: 00000000fffffffc R15: ffff888091c57428 free_thread_stack+0x168/0x590 kernel/fork.c:280 release_task_stack kernel/fork.c:440 [inline] put_task_stack+0xa3/0x130 kernel/fork.c:451 finish_task_switch+0x3f1/0x550 kernel/sched/core.c:3256 context_switch kernel/sched/core.c:3388 [inline] __schedule+0x9a8/0xcc0 kernel/sched/core.c:4081 preempt_schedule_common kernel/sched/core.c:4236 [inline] preempt_schedule+0xdb/0x120 kernel/sched/core.c:4261 ___preempt_schedule+0x16/0x18 arch/x86/entry/thunk_64.S:50 __raw_read_unlock include/linux/rwlock_api_smp.h:227 [inline] _raw_read_unlock+0x3a/0x40 kernel/locking/spinlock.c:255 kill_something_info kernel/signal.c:1586 [inline] __do_sys_kill kernel/signal.c:3640 [inline] __se_sys_kill+0x5e9/0x6c0 kernel/signal.c:3634 __x64_sys_kill+0x5b/0x70 kernel/signal.c:3634 do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x422a17 Code: 44 00 00 48 c7 c2 d4 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 b8 3e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 dd 32 ff ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff38dca538 EFLAGS: 00000293 ORIG_RAX: 000000000000003e RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 0000000000422a17 RDX: 0000000000000bb8 RSI: 0000000000000009 RDI: 00000000fffffffe RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000001c62940 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000008 R13: 00007fff38dca570 R14: 000000000000f0b6 R15: 00007fff38dca580 rcu: rcu_preempt kthread starved for 10533 jiffies! g6629 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0 rcu: RCU grace-period kthread stack dump: rcu_preempt R running task 29032 10 2 0x80004008 Call Trace: context_switch kernel/sched/core.c:3388 [inline] __schedule+0x9a8/0xcc0 kernel/sched/core.c:4081 schedule+0x181/0x210 kernel/sched/core.c:4155 schedule_timeout+0x14f/0x240 kernel/time/timer.c:1895 rcu_gp_fqs_loop kernel/rcu/tree.c:1661 [inline] rcu_gp_kthread+0xed8/0x1770 kernel/rcu/tree.c:1821 kthread+0x332/0x350 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. -- You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/00000000000036decf0598c8762e%40google.com.