Thread: 6 messages, 3 authors, 2020-01-08

Re: INFO: rcu detected stall in sys_kill

From: Daniel Axtens <hidden>
Date: 2019-12-17 13:39:02
Also in: lkml

Daniel Axtens [off-list ref] writes:
Hi Casey,
quoted
There haven't been Smack changes recently, so this is
going to have been introduced elsewhere. I'm perfectly
willing to accept that Smack is doing something horribly
wrong WRT rcu, and that it needs repair, but it's going to
be tough for me to track down. I hope someone else is looking
into this, as my chances of finding the problem are pretty
slim.
Yeah, I'm having a look; it's probably related to my kasan-vmalloc
stuff. It's currently in a bit of flux as syzkaller finds a bunch of
other bugs with it; once that stabilises a bit I'll come back to Smack.
I have had a brief and wildly unsuccessful look at this. I'm happy to
come back to it and go over it with a finer toothed comb, but it will
almost certainly have to wait until next year.

I don't think it's related to RCU; we also have a plain lockup:
https://syzkaller.appspot.com/bug?id=be03729d17bb3b2df1754a7486a8f8628f6ff1ec

Dmitry, I've been really struggling to repro this locally, even with
your config. Is there an easy way to see the kernel command line you
booted with and anything else that makes this image special? I have zero
experience with smack so this is a steep learning curve.

Regards,
Daniel
Regards,
Daniel
quoted
quoted
quoted
I see 2 common things across all stalls:
1. They all happen on the instance that uses smack (which is now
effectively dead), see smack instance here:
https://syzkaller.appspot.com/upstream
2. They all contain this frame in the stack trace:
free_thread_stack+0x168/0x590 kernel/fork.c:280
The last commit that touches this file is "fork: support VMAP_STACK
with KASAN_VMALLOC".
That may well be the root cause. +Daniel
I've stopped the smack syzbot instance because it produces an infinite
stream of assorted crashes due to this.
Please ping syzkaller@googlegroups.com when this is fixed, I will
re-enable the instance.
quoted
quoted
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
        (detected by 1, t=10502 jiffies, g=6629, q=331)
rcu: All QSes seen, last rcu_preempt kthread activity 10503
(4294953794-4294943291), jiffies_till_next_fqs=1, root ->qsmask 0x0
syz-executor.0  R  running task    24648  8293   8292 0x0000400a
Call Trace:
  <IRQ>
  sched_show_task+0x40f/0x560 kernel/sched/core.c:5954
  print_other_cpu_stall kernel/rcu/tree_stall.h:410 [inline]
  check_cpu_stall kernel/rcu/tree_stall.h:538 [inline]
  rcu_pending kernel/rcu/tree.c:2827 [inline]
  rcu_sched_clock_irq+0x1861/0x1ad0 kernel/rcu/tree.c:2271
  update_process_times+0x12d/0x180 kernel/time/timer.c:1726
  tick_sched_handle kernel/time/tick-sched.c:167 [inline]
  tick_sched_timer+0x263/0x420 kernel/time/tick-sched.c:1310
  __run_hrtimer kernel/time/hrtimer.c:1514 [inline]
  __hrtimer_run_queues+0x403/0x840 kernel/time/hrtimer.c:1576
  hrtimer_interrupt+0x38c/0xda0 kernel/time/hrtimer.c:1638
  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1110 [inline]
  smp_apic_timer_interrupt+0x109/0x280 arch/x86/kernel/apic/apic.c:1135
  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
  </IRQ>
RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline]
RIP: 0010:check_kcov_mode kernel/kcov.c:70 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x1c/0x50 kernel/kcov.c:102
Code: cc 07 48 89 de e8 64 02 3b 00 5b 5d c3 cc 48 8b 04 24 65 48 8b 0c 25
c0 1d 02 00 65 8b 15 b8 81 8b 7e f7 c2 00 01 1f 00 75 2c <8b> 91 80 13 00
00 83 fa 02 75 21 48 8b 91 88 13 00 00 48 8b 32 48
RSP: 0018:ffffc900021c7c28 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: ffffffff81487433 RBX: 0000000000000000 RCX: ffff88809428a100
RDX: 0000000000000001 RSI: 00000000fffffffc RDI: ffffea0002479240
RBP: ffffc900021c7c50 R08: dffffc0000000000 R09: fffffbfff1287025
R10: fffffbfff1287025 R11: 0000000000000000 R12: dffffc0000000000
R13: dffffc0000000000 R14: 00000000fffffffc R15: ffff888091c57428
  free_thread_stack+0x168/0x590 kernel/fork.c:280
  release_task_stack kernel/fork.c:440 [inline]
  put_task_stack+0xa3/0x130 kernel/fork.c:451
  finish_task_switch+0x3f1/0x550 kernel/sched/core.c:3256
  context_switch kernel/sched/core.c:3388 [inline]
  __schedule+0x9a8/0xcc0 kernel/sched/core.c:4081
  preempt_schedule_common kernel/sched/core.c:4236 [inline]
  preempt_schedule+0xdb/0x120 kernel/sched/core.c:4261
  ___preempt_schedule+0x16/0x18 arch/x86/entry/thunk_64.S:50
  __raw_read_unlock include/linux/rwlock_api_smp.h:227 [inline]
  _raw_read_unlock+0x3a/0x40 kernel/locking/spinlock.c:255
  kill_something_info kernel/signal.c:1586 [inline]
  __do_sys_kill kernel/signal.c:3640 [inline]
  __se_sys_kill+0x5e9/0x6c0 kernel/signal.c:3634
  __x64_sys_kill+0x5b/0x70 kernel/signal.c:3634
  do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x422a17
Code: 44 00 00 48 c7 c2 d4 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff c3 66 2e
0f 1f 84 00 00 00 00 00 0f 1f 40 00 b8 3e 00 00 00 0f 05 <48> 3d 01 f0 ff
ff 0f 83 dd 32 ff ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff38dca538 EFLAGS: 00000293 ORIG_RAX: 000000000000003e
RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 0000000000422a17
RDX: 0000000000000bb8 RSI: 0000000000000009 RDI: 00000000fffffffe
RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000001c62940
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000008
R13: 00007fff38dca570 R14: 000000000000f0b6 R15: 00007fff38dca580
rcu: rcu_preempt kthread starved for 10533 jiffies! g6629 f0x2
RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: RCU grace-period kthread stack dump:
rcu_preempt     R  running task    29032    10      2 0x80004008
Call Trace:
  context_switch kernel/sched/core.c:3388 [inline]
  __schedule+0x9a8/0xcc0 kernel/sched/core.c:4081
  schedule+0x181/0x210 kernel/sched/core.c:4155
  schedule_timeout+0x14f/0x240 kernel/time/timer.c:1895
  rcu_gp_fqs_loop kernel/rcu/tree.c:1661 [inline]
  rcu_gp_kthread+0xed8/0x1770 kernel/rcu/tree.c:1821
  kthread+0x332/0x350 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
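
The observation earlier in the thread, that every stall report contains the free_thread_stack frame, can be checked mechanically across saved reports. The following is only a rough sketch, not something used by syzbot or posted in this thread: the names FRAME_RE, parse_frames and has_frame are hypothetical, and the regular expression assumes the symbolized frame layout seen in the trace above ("func+0xoff/0xsize file:line", optionally followed by "[inline]"). Frames without a file:line suffix are deliberately skipped.

```python
import re

# Assumed frame layout, taken from the trace in this report:
#   "free_thread_stack+0x168/0x590 kernel/fork.c:280"
#   "release_task_stack kernel/fork.c:440 [inline]"
#   "RIP: 0010:__sanitizer_cov_trace_pc+0x1c/0x50 kernel/kcov.c:102"
FRAME_RE = re.compile(
    r"^\s*(?:RIP: [0-9a-f]{4}:)?"            # optional "RIP: 0010:" prefix
    r"(?P<func>[A-Za-z_][\w.]*)"             # function name
    r"(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?"        # optional +offset/size
    r"\s+(?P<file>[\w./-]+):(?P<line>\d+)"   # source file and line number
    r"(?:\s+\[inline\])?\s*$"                # optional inline marker
)


def parse_frames(report):
    """Return (function, file, line) tuples for each symbolized frame."""
    frames = []
    for raw in report.splitlines():
        match = FRAME_RE.match(raw)
        if match:
            frames.append(
                (match.group("func"), match.group("file"), int(match.group("line")))
            )
    return frames


def has_frame(report, func):
    """True if any symbolized frame in the report belongs to `func`."""
    return any(name == func for name, _, _ in parse_frames(report))
```

Given the text of a saved report in a string, has_frame(text, "free_thread_stack") reports whether that frame appears. Running it over several reports would be one way to confirm or refute the pattern.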


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

--
You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/00000000000036decf0598c8762e%40google.com.