Re: INFO: rcu detected stall in br_handle_frame (2)
From: Eric Dumazet <hidden>
Date: 2019-12-28 15:02:06
Also in: lkml
Subsystem: networking [general], tc subsystem, the rest
Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jamal Hadi Salim, Jiri Pirko, Linus Torvalds

On 12/28/19 3:15 AM, Florian Westphal wrote:
> syzbot [off-list ref] wrote:
>
> [ CC Eric, fq related ]
>
>> syzbot found the following crash on:
>>
>> HEAD commit:    7e0165b2 Merge branch 'akpm' (patches from Andrew)
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=116ec09ee00000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=1b59a3066828ac4c
>> dashboard link: https://syzkaller.appspot.com/bug?extid=dc9071cc5a85950bdfce
>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=159182c1e00000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1221218ee00000
>>
>> Bisection is inconclusive: the bug happens on the oldest tested release.
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=158224c1e00000
>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=178224c1e00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=138224c1e00000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+dc9071cc5a85950bdfce@syzkaller.appspotmail.com
>>
>> rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>> 	(detected by 0, t=10502 jiffies, g=10149, q=201)
>> rcu: All QSes seen, last rcu_preempt kthread activity 10502
>> (4294978441-4294967939), jiffies_till_next_fqs=1, root ->qsmask 0x0
>> sshd            R  running task    26584 10034   9965 0x00000008
>> Call Trace:
>>   <IRQ>
>>   sched_show_task kernel/sched/core.c:5954 [inline]
>
> [..]
>
> The reproducer sets up 'fq' sched with
> TCA_FQ_QUANTUM == 0x80000000
>
> This causes infinite loop in fq_dequeue:
>
> if (f->credit <= 0) {
>   f->credit += q->quantum;
>   goto begin;
> }
>
> ... because f->credit is either 0 or -2147483648.
>
> Eric, what is a 'sane' ->quantum value?
>
> One could simply add a 'quantum > 0 && quantum < INT_MAX' constraint afaics.
>
> If you don't have a better idea/suggestion for an upperlimit INT_MAX would be
> enough to prevent perpetual <= 0 condition.

Thanks Florian for the analysis.

I guess we could use a conservative upper bound value of (1 << 20)
(about 16 64KB packets)

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index ff4c5e9d0d7778d86f20f4bd67cc627eed0713d9..12f1d1c6044fac9db987f7ce3a50a7e2c711358b 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -786,15 +786,20 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt,
 	if (tb[TCA_FQ_QUANTUM]) {
 		u32 quantum = nla_get_u32(tb[TCA_FQ_QUANTUM]);
 
-		if (quantum > 0)
+		if (quantum > 0 && quantum <= (1 << 20))
 			q->quantum = quantum;
 		else
 			err = -EINVAL;
 	}
 
-	if (tb[TCA_FQ_INITIAL_QUANTUM])
-		q->initial_quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
+	if (tb[TCA_FQ_INITIAL_QUANTUM]) {
+		u32 quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
+		if (quantum > 0 && quantum <= (1 << 20))
+			q->initial_quantum = quantum;
+		else
+			err = -EINVAL;
+	}
 
 	if (tb[TCA_FQ_FLOW_DEFAULT_RATE])
 		pr_warn_ratelimited("sch_fq: defrate %u ignored.\n",
 				    nla_get_u32(tb[TCA_FQ_FLOW_DEFAULT_RATE]));
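
For readers who want to see the arithmetic behind the stall, here is a rough userspace sketch of the credit refill step discussed above. It is not kernel code: the function names refills_until_positive() and fq_quantum_ok() are made up for illustration, and it assumes the credit is a 32-bit signed value and the quantum a u32, with the addition wrapping the way it does in the kernel build. The refill is done in unsigned arithmetic and converted back, so the sketch does not rely on signed overflow.

```c
#include <stdint.h>

/*
 * Userspace model of the refill step in the loop quoted above:
 *     if (credit <= 0) { credit += quantum; goto begin; }
 * Returns how many refills are needed before credit becomes positive,
 * or -1 if it is still <= 0 after max_iters refills.
 * Hypothetical helper, not part of sch_fq.c.
 */
int refills_until_positive(int32_t credit, uint32_t quantum, int max_iters)
{
	int n = 0;

	while (credit <= 0) {
		if (n == max_iters)
			return -1;
		/* Add as unsigned, then convert back: wraps on mainstream compilers. */
		credit = (int32_t)((uint32_t)credit + quantum);
		n++;
	}
	return n;
}

/* Same range check as in the patch: 0 < quantum <= (1 << 20). */
int fq_quantum_ok(uint32_t quantum)
{
	return quantum > 0 && quantum <= (1u << 20);
}
```

With quantum == 0x80000000 and a starting credit of 0, the value just alternates between -2147483648 and 0, so refills_until_positive() gives up and returns -1. With any quantum accepted by fq_quantum_ok(), each refill strictly increases a non-positive credit without wrapping, so the loop ends after a bounded number of steps.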