Thread (6 messages), 4 authors, 2020-01-07

Re: INFO: rcu detected stall in sys_sendfile64

From: Dmitry Vyukov <dvyukov@google.com>
Date: 2020-01-07 13:03:02
Also in: lkml

On Sat, Jan 4, 2020 at 12:09 PM Tetsuo Handa
[off-list ref] wrote:
On 2018/12/20 3:42, Dmitry Vyukov wrote:
On Wed, Dec 19, 2018 at 11:13 AM Tetsuo Handa
[off-list ref] wrote:
On 2018/12/19 18:27, syzbot wrote:
HEAD commit:    ddfbab46539f Merge tag 'scsi-fixes' of git://git.kernel.or..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15b87fa3400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=861a3573f4e78ba1
dashboard link: https://syzkaller.appspot.com/bug?extid=bcad772bbc241b4c6147
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13912ccd400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=145781db400000
This is not a LSM problem, for the reproducer is calling
sched_setattr(SCHED_DEADLINE) with very large values.

  sched_setattr(0, {size=0, sched_policy=0x6 /* SCHED_??? */, sched_flags=0, sched_nice=0, sched_priority=0, sched_runtime=2251799813724439, sched_deadline=4611686018427453437, sched_period=0}, 0) = 0

I think that this problem is nothing but an insane sched_setattr() parameter.
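
To make "very large values" concrete, here is a minimal Python sketch that
converts the nanosecond values from the strace line above into human units
and checks them against the constraints documented in sched(7) (each value
at least 1024 and below 2^63, runtime <= deadline <= period, and a period
of 0 treated as equal to the deadline). The function names are hypothetical,
and the check mirrors the man page text only, not the kernel's actual code.

```python
# Values copied from the strace line in the report (all in nanoseconds).
SCHED_RUNTIME_NS = 2251799813724439
SCHED_DEADLINE_NS = 4611686018427453437
SCHED_PERIOD_NS = 0  # 0 means "use the deadline as the period"

NS_PER_SEC = 10**9
SEC_PER_DAY = 86400
SEC_PER_YEAR = 365 * SEC_PER_DAY


def passes_documented_checks(runtime_ns, deadline_ns, period_ns):
    """Apply the SCHED_DEADLINE constraints listed in sched(7).

    Hypothetical helper: an approximation of the documented rules,
    not a copy of the kernel's validation code.
    """
    if period_ns == 0:
        period_ns = deadline_ns
    values = (runtime_ns, deadline_ns, period_ns)
    in_range = all(1024 <= v < 2**63 for v in values)
    ordered = runtime_ns <= deadline_ns <= period_ns
    return in_range and ordered


def ns_to_days(ns):
    """Convert nanoseconds to days."""
    return ns / NS_PER_SEC / SEC_PER_DAY


def ns_to_years(ns):
    """Convert nanoseconds to 365-day years."""
    return ns / NS_PER_SEC / SEC_PER_YEAR


if __name__ == "__main__":
    ok = passes_documented_checks(
        SCHED_RUNTIME_NS, SCHED_DEADLINE_NS, SCHED_PERIOD_NS
    )
    print("passes documented checks:", ok)
    print(f"runtime  is about {ns_to_days(SCHED_RUNTIME_NS):.1f} days")
    print(f"deadline is about {ns_to_years(SCHED_DEADLINE_NS):.1f} years")
```

Under these assumptions the parameters are formally acceptable, which is
consistent with the call returning 0, yet they ask for a runtime of roughly
26 days within a deadline of roughly 146 years.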

#syz invalid
Note there was another one with sched_setattr, which turned out to be
some serious problem in kernel (sched_setattr should not cause CPU
stall for 3 minutes):
INFO: rcu detected stall in do_idle
https://syzkaller.appspot.com/bug?extid=385468161961cee80c31
https://groups.google.com/forum/#!msg/syzkaller-bugs/crrfvusGtwI/IoD_zus4BgAJ

Maybe it is another incarnation of the same bug; that one is still not fixed.
Can we let syzbot blacklist sched_setattr() for now? There are many stall reports
doing sched_setattr(SCHED_RR) which makes it difficult to find stall reports not
using sched_setattr().
Hi Tetsuo,

If we start a practice of disabling whole syscalls, I would really like
"for now" to be very well defined. When will it end? How will it
happen? Is the problem on the radar of the relevant people? Will it stay
on somebody's radar until it's fixed? The normal practice in project
sheriffing is to file a P1 bug assigned to somebody when something
gets disabled. But I am not sure how we would implement this for the
kernel. Since the problem has been there for a long time and we would be
disabling it without defining any criteria, I am afraid we would disable
it forever (then more bugs will pile up and re-enabling it will be
painful). At the very least we need to acknowledge that we are stopping
testing of the scheduler for the foreseeable future, and the scheduler
maintainers need to be notified about this.
Blacklisting it and then un-blacklisting it will cause some churn. Was
the bug given at least some attention? A significant number of bugs are
relatively easy to fix, and fixing this one would solve all of the
problems in a much better way.