Re: INFO: rcu detected stall in sys_sendfile64
From: Dmitry Vyukov <dvyukov@google.com>
Date: 2020-01-07 13:03:02
Also in: lkml
On Sat, Jan 4, 2020 at 12:09 PM Tetsuo Handa wrote:
> On 2018/12/20 3:42, Dmitry Vyukov wrote:
>> On Wed, Dec 19, 2018 at 11:13 AM Tetsuo Handa wrote:
>>> On 2018/12/19 18:27, syzbot wrote:
>>>> HEAD commit: ddfbab46539f Merge tag 'scsi-fixes' of git://git.kernel.or..
>>>> git tree: upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=15b87fa3400000
>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=861a3573f4e78ba1
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bcad772bbc241b4c6147
>>>> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13912ccd400000
>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=145781db400000
>>>
>>> This is not a LSM problem, for the reproducer is calling
>>> sched_setattr(SCHED_DEADLINE) with very large values.
>>>
>>> sched_setattr(0, {size=0, sched_policy=0x6 /* SCHED_??? */, sched_flags=0, sched_nice=0, sched_priority=0, sched_runtime=2251799813724439, sched_deadline=4611686018427453437, sched_period=0}, 0) = 0
>>>
>>> I think that this problem is nothing but an insane sched_setattr() parameter.
>>>
>>> #syz invalid
>>
>> Note there was another one with sched_setattr, which turned out to be
>> some serious problem in kernel (sched_setattr should not cause CPU
>> stall for 3 minutes):
>>
>> INFO: rcu detected stall in do_idle
>> https://syzkaller.appspot.com/bug?extid=385468161961cee80c31
>> https://groups.google.com/forum/#!msg/syzkaller-bugs/crrfvusGtwI/IoD_zus4BgAJ
>>
>> Maybe it another incarnation of the same bug, that one is still not fixed.
>
> Can we let syzbot blacklist sched_setattr() for now? There are many stall reports
> doing sched_setattr(SCHED_RR) which makes it difficult to find stall reports
> not using sched_setattr().
Hi Tetsuo,

If we start a practice of disabling whole syscalls, I would really like "for now" to be very well defined. When will it end? How will it happen? Is the problem on the radar of the relevant people? Will it stay on somebody's radar until it's fixed?

The normal practice in project sheriffing is to file a P1 bug assigned to somebody when something gets disabled. But I am not sure how we would implement this for the kernel. Since the problem has been there for a long time and we would be disabling it without defining any criteria, I am afraid we would disable it forever (then more bugs will pile up and re-enabling it will be painful). At the very least we need to acknowledge that we are stopping testing the scheduler for the foreseeable future, and the scheduler maintainers need to be notified about this.

Blacklisting it and un-blacklisting it will cause some churn. Was the bug given at least some attention? A significant number of bugs are relatively easy to fix, and fixing it would solve all of the problems in a much better way.
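
For a sense of scale, the sched_setattr() parameters quoted from the reproducer can be converted to wall-clock units. The sketch below is only illustrative arithmetic, not part of the reproducer or of syzkaller. It assumes the documented sched_setattr(2) behaviour that sched_runtime, sched_deadline and sched_period are in nanoseconds and that a sched_period of 0 is treated as equal to sched_deadline. The helper names are made up for this example.

```python
# Values copied from the strace-style line quoted in the thread.
sched_runtime = 2251799813724439      # nanoseconds
sched_deadline = 4611686018427453437  # nanoseconds
sched_period = 0                      # 0 means "same as sched_deadline"

NS_PER_SEC = 1_000_000_000
SEC_PER_DAY = 86_400
SEC_PER_YEAR = 365 * SEC_PER_DAY  # calendar-year approximation


def ns_to_days(ns: int) -> float:
    """Convert nanoseconds to days (hypothetical helper for this example)."""
    return ns / NS_PER_SEC / SEC_PER_DAY


def ns_to_years(ns: int) -> float:
    """Convert nanoseconds to 365-day years (hypothetical helper)."""
    return ns / NS_PER_SEC / SEC_PER_YEAR


# Assumption from the man page: a zero period falls back to the deadline.
effective_period = sched_period or sched_deadline

runtime_days = ns_to_days(sched_runtime)
deadline_years = ns_to_years(sched_deadline)
period_years = ns_to_years(effective_period)

# Both values sit just above a power of two, as fuzzer-generated inputs often do.
runtime_offset = sched_runtime - 2**51
deadline_offset = sched_deadline - 2**62

print(f"runtime  ~ {runtime_days:.1f} days   (2**51 + {runtime_offset} ns)")
print(f"deadline ~ {deadline_years:.1f} years (2**62 + {deadline_offset} ns)")
print(f"period   ~ {period_years:.1f} years")
```

In other words, the task asks for a runtime budget of roughly 26 days within a deadline of roughly 146 years, which is the sense in which the values are "very large".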