Thread: 31 messages, 6 authors, 2019-07-23

Re: INFO: rcu detected stall in ext4_write_checks

From: Paul E. McKenney <hidden>
Date: 2019-07-15 14:02:26
Also in: linux-ext4, lkml

On Mon, Jul 15, 2019 at 03:46:51PM +0200, Peter Zijlstra wrote:
On Mon, Jul 15, 2019 at 03:33:11PM +0200, Dmitry Vyukov wrote:
On Mon, Jul 15, 2019 at 3:29 PM Peter Zijlstra [off-list ref] wrote:
On Sun, Jul 14, 2019 at 11:49:15AM -0700, Paul E. McKenney wrote:
On Sun, Jul 14, 2019 at 05:48:00PM +0300, Dmitry Vyukov wrote:
But short term I don't see any other solution than stop testing
sched_setattr because it does not check arguments enough to prevent
system misbehavior. Which is a pity because syzkaller has found some
bad misconfigurations that were oversights on the checking side.
Any other suggestions?
Keep the times down to a few seconds?  Of course, that might also
fail to find interesting bugs.
Right, if syzkaller can put a limit on the period/deadline parameters
(and make sure to not write "-1" to
/proc/sys/kernel/sched_rt_runtime_us) then the in-kernel
access-control should not allow these things to happen.
Since we are racing with emails, could you suggest 100% safe
parameters? Because I only hear people saying "safe", "sane",
"well-behaving" :)
If we move the check to user-space, it does not mean that we can get
away without actually defining what that means.
Right, well, that's part of the problem. I think Paul just did the
reverse math and figured that 95% of X must not be larger than my
watchdog timeout and landed on 14 seconds.
I was actually working backwards from the 21-second RCU CPU stall
timeout, but there are likely many other limits to consider.
I'm thinking 4 seconds (or rather 4.294967296) would be a very nice
number.
Works for me!  That should give the various RCU kthreads ample
opportunities to execute within the RCU CPU stall timeout.
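
The 4.294967296 figure appears to be 2^32 nanoseconds expressed in seconds. A minimal sketch of the arithmetic follows, assuming the cap applies to the nanosecond-valued period/deadline/runtime fields passed to sched_setattr. The names DL_PARAM_CAP_NS, clamp_dl_param_ns, and periods_per_stall_timeout are hypothetical and come from neither the kernel nor syzkaller.

```c
#include <stdint.h>

#define NSEC_PER_SEC_U64 UINT64_C(1000000000)

/* Assumption: the proposed cap is exactly 2^32 ns, i.e. 4.294967296 s. */
#define DL_PARAM_CAP_NS (UINT64_C(1) << 32)

/* The 21-second RCU CPU stall timeout mentioned in the thread, in ns. */
#define RCU_STALL_TIMEOUT_NS (UINT64_C(21) * NSEC_PER_SEC_U64)

/* Hypothetical fuzzer-side helper: clamp a generated value to the cap. */
static uint64_t clamp_dl_param_ns(uint64_t ns)
{
    return ns > DL_PARAM_CAP_NS ? DL_PARAM_CAP_NS : ns;
}

/* Number of whole capped periods that fit inside one stall timeout. */
static uint64_t periods_per_stall_timeout(void)
{
    return RCU_STALL_TIMEOUT_NS / DL_PARAM_CAP_NS;
}
```

With these values, four full periods of the maximum length fit inside the 21-second timeout, which is one way to read "ample opportunities to execute".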

The rcuo callback-offload kthreads will need special handling, but if
someone has 100 CPUs wildly generating callbacks and allocates but one
CPU to invoke them, there is not much either the RCU or the scheduler
can do to make that work.  ;-)

							Thanx, Paul
quoted
Now thinking of this, if we come up with some simple criteria, could
we have something like a sysctl that would allow only really "safe"
parameters?
I suppose we could do that, something like:
sysctl_deadline_period_{min,max}. I'll have to dig back a bit on where
we last talked about that and what the problems were.

For one, setting the min is a lot harder, but I suppose we can start at
TICK_NSEC or something.
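
The range check that a sysctl_deadline_period_{min,max} pair would imply can be sketched in user-space C as below. This is an illustration under stated assumptions, not kernel code: struct dl_period_limits and dl_period_allowed are hypothetical names, the minimum uses a tick length computed for an assumed HZ of 250, and the maximum reuses the 2^32 ns value discussed earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: HZ=250, giving a 4,000,000 ns tick. Real kernels vary. */
#define EXAMPLE_HZ        UINT64_C(250)
#define EXAMPLE_TICK_NSEC (UINT64_C(1000000000) / EXAMPLE_HZ)

/* Hypothetical stand-in for the proposed min/max sysctl pair. */
struct dl_period_limits {
    uint64_t min_ns;
    uint64_t max_ns;
};

/* Accept a period only if it lies within [min_ns, max_ns]. */
static bool dl_period_allowed(const struct dl_period_limits *lim,
                              uint64_t period_ns)
{
    return period_ns >= lim->min_ns && period_ns <= lim->max_ns;
}

/* Example limits: one tick as the floor, 2^32 ns as the ceiling. */
static const struct dl_period_limits example_limits = {
    .min_ns = EXAMPLE_TICK_NSEC,
    .max_ns = UINT64_C(1) << 32,
};
```

Rejecting out-of-range values, rather than silently clamping them, keeps the failure visible to the caller; which behavior the kernel should adopt is left open in the thread.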
  