Re: INFO: rcu detected stall in ext4_write_checks
From: Peter Zijlstra <peterz@infradead.org>
Date: 2019-07-15 13:47:16
Also in: linux-ext4, lkml
On Mon, Jul 15, 2019 at 03:33:11PM +0200, Dmitry Vyukov wrote:
> On Mon, Jul 15, 2019 at 3:29 PM Peter Zijlstra [off-list ref] wrote:
> > On Sun, Jul 14, 2019 at 11:49:15AM -0700, Paul E. McKenney wrote:
> > > On Sun, Jul 14, 2019 at 05:48:00PM +0300, Dmitry Vyukov wrote:
> > > > But short term I don't see any other solution than stop testing sched_setattr because it does not check arguments enough to prevent system misbehavior. Which is a pity because syzkaller has found some bad misconfigurations that were oversight on checking side. Any other suggestions?
> > > Keep the times down to a few seconds? Of course, that might also fail to find interesting bugs.
> > Right, if syzkaller can put a limit on the period/deadline parameters (and make sure to not write "-1" to /proc/sys/kernel/sched_rt_runtime_us) then per the in-kernel access-control should not allow these things to happen.
> Since we are racing with emails, could you suggest a 100% safe parameters? Because I only hear people saying "safe", "sane", "well-behaving" :) If we move the check to user-space, it does not mean that we can get away without actually defining what that means.
Right, well, that's part of the problem. I think Paul just did the reverse math and figured that 95% of X must not be larger than my watchdog timeout and landed on 14 seconds. I'm thinking 4 seconds (or rather 4.294967296) would be a very nice number.
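For reference, 4.294967296 seconds is exactly 2^32 nanoseconds. The arithmetic, together with the "reverse math" described above (95% of a period X compared against a watchdog timeout), can be sketched as follows. This is a minimal illustration only: the helper names are hypothetical, the percentage and timeout are plain parameters, and none of this is kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2^32 ns = 4294967296 ns = 4.294967296 s, the "nice number" above. */
#define PERIOD_CAP_NS (UINT64_C(1) << 32)

/*
 * Hypothetical helper: the longest stretch a task could run per period if it
 * may consume runtime_pct percent of a period of period_ns nanoseconds.
 * Splitting into quotient and remainder avoids overflowing the multiply
 * while still producing the exact floor of period_ns * runtime_pct / 100.
 */
static uint64_t max_runtime_ns(uint64_t period_ns, unsigned int runtime_pct)
{
	return (period_ns / 100) * runtime_pct +
	       (period_ns % 100) * runtime_pct / 100;
}

/* Hypothetical check: does that runtime stay within a watchdog timeout? */
static bool fits_watchdog(uint64_t period_ns, unsigned int runtime_pct,
			  uint64_t watchdog_ns)
{
	return max_runtime_ns(period_ns, runtime_pct) <= watchdog_ns;
}
```

With a 95% budget, a period of PERIOD_CAP_NS allows roughly 4.08 seconds of runtime per period, which is the quantity one would compare against whatever watchdog timeout applies.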
> Now thinking of this, if we come up with some simple criteria, could we have something like a sysctl that would allow only really "safe" parameters?
I suppose we could do that, something like:
sysctl_deadline_period_{min,max}. I'll have to dig back a bit on where
we last talked about that and what the problems were.
For one, setting the min is a lot harder, but I suppose we can start at
TICK_NSEC or something.
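A rough user-space sketch of what such a min/max range check might look like is given here. Everything in it is an assumption made for illustration: the struct and function names are hypothetical, EXAMPLE_HZ is an arbitrary tick rate, EXAMPLE_TICK_NSEC only approximates the kernel's TICK_NSEC, and the 2^32 ns upper bound is simply the figure floated earlier in this mail, not a settled value.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/* Assumed tick rate, for illustration only; the kernel's HZ is a build option. */
#define EXAMPLE_HZ 250

/* Rough stand-in for the kernel's TICK_NSEC: about one tick, in nanoseconds. */
#define EXAMPLE_TICK_NSEC (NSEC_PER_SEC / EXAMPLE_HZ)

/* Hypothetical container for the two proposed limits, in nanoseconds. */
struct deadline_period_limits {
	uint64_t min_ns;
	uint64_t max_ns;
};

/* Accept a requested period only if it lies within [min_ns, max_ns]. */
static bool deadline_period_ok(const struct deadline_period_limits *lim,
			       uint64_t period_ns)
{
	return period_ns >= lim->min_ns && period_ns <= lim->max_ns;
}
```

With limits of { EXAMPLE_TICK_NSEC, 1 << 32 }, a 14 second period would be rejected while a 1 second period would pass. A fuzzer could apply the same check in user space before calling sched_setattr.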