Thread: 15 messages, 8 authors, 2021-05-14

Re: [PATCH net-next] net: Treat __napi_schedule_irqoff() as __napi_schedule() on PREEMPT_RT

From: Alison Chaiken <hidden>
Date: 2021-05-14 19:44:30
Also in: lkml, netdev

On Fri, May 14, 2021 at 11:56 AM Jakub Kicinski [off-list ref] wrote:
On Thu, 13 May 2021 00:28:02 +0200 Thomas Gleixner wrote:
quoted
On Wed, May 12 2021 at 23:43, Sebastian Andrzej Siewior wrote:
quoted
__napi_schedule_irqoff() is an optimized version of __napi_schedule()
which can be used where it is known that interrupts are disabled,
e.g. in interrupt-handlers, spin_lock_irq() sections or hrtimer
callbacks.

On PREEMPT_RT enabled kernels this assumption is not true. Force-threaded
interrupt handlers and spinlocks do not disable interrupts, and the NAPI
hrtimer callback is forced into softirq context, which runs with
interrupts enabled as well.

Chasing all usage sites of __napi_schedule_irqoff() is a whack-a-mole
game so make __napi_schedule_irqoff() invoke __napi_schedule() for
PREEMPT_RT kernels.

The callers of ____napi_schedule() in the networking core have been
audited and are correct on PREEMPT_RT kernels as well.

Reported-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Thomas Gleixner <redacted>
quoted
---
Alternatively __napi_schedule_irqoff() could be #ifdef'ed out on RT and
an inline provided which invokes __napi_schedule().

This was not chosen as it creates #ifdeffery all over the place and with
the proposed solution the code reflects the documentation consistently
and in one obvious place.

Blame me for that decision.

No matter which variant we end up with, this needs to go into all stable
RT kernels ASAP.

Mumble mumble. I thought we concluded that drivers used on RT can be
fixed, we've already done it for a couple of drivers (by which I mean two).
If all the IRQ handler is doing is scheduling NAPI (which it is for
modern NICs) - IRQF_NO_THREAD seems like the right option.

Is there any driver you care about that we can convert to using
IRQF_NO_THREAD, so that new drivers can "do the right thing"
while the old ones depend on this workaround for now?


Another thing while I have your attention - ____napi_schedule() does
__raise_softirq_irqoff() which AFAIU does not wake the ksoftirqd thread.
On non-RT we get occasional NOHZ warnings when drivers schedule NAPI
from process context, but on RT this is even more of a problem, right?
ksoftirqd won't run until something else actually wakes it up?

By "NOHZ warnings," do you mean "NOHZ: local_softirq_pending"? We see
that message about once a week with 4.19. Presumably any failure of
____napi_schedule() to wake ksoftirqd could only cause problems for the
NET_RX softirq, so if the pending softirq is different, the cause lies
elsewhere.

-- Alison Chaiken
   Aurora Innovation
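
The fallback described in the quoted patch description can be modelled
in userspace C. This is a minimal sketch under stated assumptions, not
the kernel implementation: the model_* functions, irqs_disabled_flag and
the two counters are hypothetical stand-ins for ____napi_schedule(),
__napi_schedule(), the local interrupt state and the per-CPU poll list,
and the preempt_rt parameter stands in for a build-time PREEMPT_RT check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in the kernel. */
static bool irqs_disabled_flag; /* models whether local interrupts are off */
static int poll_list_len;       /* models entries on the per-CPU poll list */
static int unsafe_enqueues;     /* enqueues performed with interrupts on */

/* Models ____napi_schedule(): expects interrupts to be disabled already. */
static void model_enqueue(void)
{
    if (!irqs_disabled_flag)
        unsafe_enqueues++;      /* the situation the patch wants to avoid */
    poll_list_len++;
}

/* Models __napi_schedule(): save and disable interrupts, enqueue, restore. */
static void model_napi_schedule(void)
{
    bool saved = irqs_disabled_flag;

    irqs_disabled_flag = true;
    model_enqueue();
    irqs_disabled_flag = saved;
}

/*
 * Models __napi_schedule_irqoff() with the proposed change: on an RT
 * kernel the caller's "interrupts are off" assumption cannot be trusted,
 * so fall back to the variant that disables interrupts itself.
 */
static void model_napi_schedule_irqoff(bool preempt_rt)
{
    if (!preempt_rt)
        model_enqueue();
    else
        model_napi_schedule();
}
```

With irqs_disabled_flag left false, as in a force-threaded handler,
model_napi_schedule_irqoff(true) leaves unsafe_enqueues at 0, while
model_napi_schedule_irqoff(false) in the same state increments it.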