Thread (9 messages) 9 messages, 3 authors, 2021-04-09

Re: [PATCH net] net: fix hangup on napi_disable for threaded napi

From: Paolo Abeni <pabeni@redhat.com>
Date: 2021-04-09 09:24:14
Subsystem: networking [general], the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds

On Wed, 2021-04-07 at 11:13 -0700, Jakub Kicinski wrote:
On Wed, 07 Apr 2021 16:54:29 +0200 Paolo Abeni wrote:
quoted
quoted
quoted
I think in the above example even the normal processing will be
fooled?!? e.g. even without the napi_disable(), napi_thread_wait() will
 will miss the event/will not understand to it really own the napi and
will call schedule().

It looks a different problem to me ?!?

I *think* that replacing inside the napi_thread_wait() loop:

	if (test_bit(NAPI_STATE_SCHED_THREADED, &napi->state) || woken) 

with:

	unsigned long state = READ_ONCE(napi->state);

	if (state & NAPIF_STATE_SCHED &&
	    !(state & (NAPIF_STATE_IN_BUSY_POLL | NAPIF_STATE_DISABLE)) 

should solve it and should also allow removing the
NAPI_STATE_SCHED_THREADED bit. I feel like I'm missing some relevant
point here.  
Heh, that's closer to the proposal Eric put forward.

I strongly dislike   
I guess that can't be addressed ;)
I'm not _that_ unreasonable, I hope :) if there is multiple people
disagreeing with me then so be it.
I'm sorry, I mean no offence! Just joking about the fact that is
usually hard changing preferences ;)
quoted
If you have strong opinion against the above, the only other option I
can think of is patching napi_schedule_prep() to set
both NAPI_STATE_SCHED and NAPI_STATE_SCHED_THREADED if threaded mode is
enabled for the running NAPI. That looks more complex and error prone,
so I really would avoid that.

Any other better option?

Side note: regardless of the above, I think we still need something
similar to the code in this patch, can we address the different issues
separately?
Not sure what issues you're referring to.
The patch that started this thread was ment to address a slightly
different race: napi_disable() hanging because napi_threaded_poll()
don't clear the NAPI_STATE_SCHED even when owning the napi instance.
quoted hunk ↗ jump to hunk
Right, I think the problem is disable_pending check is out of place.

How about this:
diff --git a/net/core/dev.c b/net/core/dev.c
index 9d1a8fac793f..e53f8bfed6a1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7041,7 +7041,7 @@ static int napi_thread_wait(struct napi_struct *napi)
 
        set_current_state(TASK_INTERRUPTIBLE);
 
-       while (!kthread_should_stop() && !napi_disable_pending(napi)) {
+       while (!kthread_should_stop()) {
                /* Testing SCHED_THREADED bit here to make sure the current
                 * kthread owns this napi and could poll on this napi.
                 * Testing SCHED bit is not enough because SCHED bit might be
@@ -7049,8 +7049,14 @@ static int napi_thread_wait(struct napi_struct *napi)
                 */
                if (test_bit(NAPI_STATE_SCHED_THREADED, &napi->state) || woken) {
                        WARN_ON(!list_empty(&napi->poll_list));
-                       __set_current_state(TASK_RUNNING);
-                       return 0;
+                       if (unlikely(napi_disable_pending(napi))) {
+                               clear_bit(NAPI_STATE_SCHED, &napi->state);
+                               clear_bit(NAPI_STATE_SCHED_THREADED,
+                                         &napi->state);
+                       } else {
+                               __set_current_state(TASK_RUNNING);
+                               return 0;
+                       }
                }
 
                schedule();
It looks like the above works, and also fixes the problem I originally
reported. 

I think it can be simplified as the following - if NAPIF_STATE_DISABLE
is set, napi_threaded_poll()/__napi_poll() will return clearing the
SCHEDS bits after trying to poll the device:

---
diff --git a/net/core/dev.c b/net/core/dev.c
index d9db02d4e044..5cb6f411043d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6985,7 +6985,7 @@ static int napi_thread_wait(struct napi_struct *napi)
 
        set_current_state(TASK_INTERRUPTIBLE);
 
-       while (!kthread_should_stop() && !napi_disable_pending(napi)) {
+       while (!kthread_should_stop()) {
                /* Testing SCHED_THREADED bit here to make sure the current
                 * kthread owns this napi and could poll on this napi.
                 * Testing SCHED bit is not enough because SCHED bit might be

---
And works as intended here. I'll submit that formally, unless there are
objection.

Thanks!

Paolo
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help