Thread (4 messages), 3 authors, 2019-12-30

Re: INFO: rcu detected stall in br_handle_frame (2)

From: Eric Dumazet <hidden>
Date: 2019-12-28 15:02:06
Also in: lkml
Subsystem: networking [general], tc subsystem, the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jamal Hadi Salim, Jiri Pirko, Linus Torvalds


On 12/28/19 3:15 AM, Florian Westphal wrote:
syzbot [off-list ref] wrote:

[ CC Eric, fq related ]
syzbot found the following crash on:

HEAD commit:    7e0165b2 Merge branch 'akpm' (patches from Andrew)
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=116ec09ee00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1b59a3066828ac4c
dashboard link: https://syzkaller.appspot.com/bug?extid=dc9071cc5a85950bdfce
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=159182c1e00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1221218ee00000

Bisection is inconclusive: the bug happens on the oldest tested release.

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=158224c1e00000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=178224c1e00000
console output: https://syzkaller.appspot.com/x/log.txt?x=138224c1e00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+dc9071cc5a85950bdfce@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
	(detected by 0, t=10502 jiffies, g=10149, q=201)
rcu: All QSes seen, last rcu_preempt kthread activity 10502
(4294978441-4294967939), jiffies_till_next_fqs=1, root ->qsmask 0x0
sshd            R  running task    26584 10034   9965 0x00000008
Call Trace:
 <IRQ>
 sched_show_task kernel/sched/core.c:5954 [inline]
[..]

The reproducer sets up the 'fq' sched with TCA_FQ_QUANTUM == 0x80000000.

This causes an infinite loop in fq_dequeue:

if (f->credit <= 0) {
  f->credit += q->quantum;
  goto begin;
}

... because f->credit is either 0 or -2147483648.
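
To make the wraparound concrete, here is a minimal userspace sketch of the refill arithmetic. It assumes a 32-bit signed credit and an unsigned 32-bit quantum, as in the snippet above; wrap_to_s32(), refill_credit() and credit_recovers() are hypothetical names for illustration, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable two's-complement reinterpretation of a u32 as an s32. */
static int32_t wrap_to_s32(uint32_t u)
{
	if (u <= (uint32_t)INT32_MAX)
		return (int32_t)u;
	return (int32_t)(u - 0x80000000u) + INT32_MIN;
}

/* Models "f->credit += q->quantum" with a signed credit and u32 quantum. */
static int32_t refill_credit(int32_t credit, uint32_t quantum)
{
	return wrap_to_s32((uint32_t)credit + quantum);
}

/*
 * Returns true if the credit becomes positive within max_rounds refills,
 * i.e. if the "credit <= 0 -> refill -> goto begin" cycle would end.
 */
static bool credit_recovers(int32_t credit, uint32_t quantum,
			    unsigned int max_rounds)
{
	for (unsigned int i = 0; i < max_rounds; i++) {
		if (credit > 0)
			return true;
		credit = refill_credit(credit, quantum);
	}
	return credit > 0;
}
```

Starting from a credit of 0, credit_recovers(0, 0x80000000u, n) stays false for any n, because each refill only toggles the credit between -2147483648 and 0. With a quantum of (1 << 20) the first refill already makes the credit positive.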

Eric, what is a 'sane' ->quantum value?

One could simply add a 'quantum > 0 && quantum < INT_MAX'
constraint afaics.

If you don't have a better idea/suggestion for an upper limit, INT_MAX
would be enough to prevent a perpetual <= 0 condition.

Thanks Florian for the analysis.

I guess we could use a conservative upper bound value of (1 << 20)
(about sixteen 64KB packets).

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index ff4c5e9d0d7778d86f20f4bd67cc627eed0713d9..12f1d1c6044fac9db987f7ce3a50a7e2c711358b 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -786,15 +786,20 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt,
        if (tb[TCA_FQ_QUANTUM]) {
                u32 quantum = nla_get_u32(tb[TCA_FQ_QUANTUM]);
 
-               if (quantum > 0)
+               if (quantum > 0 && quantum <= (1 << 20))
                        q->quantum = quantum;
                else
                        err = -EINVAL;
        }
 
-       if (tb[TCA_FQ_INITIAL_QUANTUM])
-               q->initial_quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
+       if (tb[TCA_FQ_INITIAL_QUANTUM]) {
+               u32 quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
 
+               if (quantum > 0 && quantum <= (1 << 20))
+                       q->initial_quantum = quantum;
+               else
+                       err = -EINVAL;
+       }
        if (tb[TCA_FQ_FLOW_DEFAULT_RATE])
                pr_warn_ratelimited("sch_fq: defrate %u ignored.\n",
                                    nla_get_u32(tb[TCA_FQ_FLOW_DEFAULT_RATE]));
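
The range check that the patch applies to both attributes can be exercised outside the kernel with a small standalone sketch. FQ_QUANTUM_MAX and fq_validate_quantum() are hypothetical names introduced here; the patch itself writes the bound inline and has no such helper.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical name for the (1 << 20) bound used inline in the patch. */
#define FQ_QUANTUM_MAX (1u << 20)

/*
 * Accepts quantum only in the range 1..FQ_QUANTUM_MAX. On success the
 * value is stored in *dst and 0 is returned; otherwise *dst is left
 * untouched and -EINVAL is returned, mirroring the err = -EINVAL path.
 */
static int fq_validate_quantum(uint32_t quantum, uint32_t *dst)
{
	if (quantum == 0 || quantum > FQ_QUANTUM_MAX)
		return -EINVAL;
	*dst = quantum;
	return 0;
}
```

With this check the reproducer's value 0x80000000 is rejected, while (1 << 20) itself is still accepted.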