Thread (27 messages) 27 messages, 6 authors, 2021-02-26

Re: [PATCH net] net: fix race between napi kthread mode and busy poll

From: Jakub Kicinski <kuba@kernel.org>
Date: 2021-02-25 00:21:46
Subsystem: networking drivers, networking [general], the rest · Maintainers: Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds

Possibly related (same subject, not in this thread)

On Thu, 25 Feb 2021 00:11:34 +0000 Alexander Duyck wrote:
quoted
quoted
We were trying to not pollute the list (with about 40 different emails
so far)

(Note this was not something I initiated, I only hit Reply all button)

OK, I will shut up, since you seem to take over this matter, and it is
1am here in France.  
Are you okay with adding a SCHED_THREADED bit for threaded NAPI to be
set in addition to SCHED? At least that way the bit is associated with it's user.
IIUC since the extra clear_bit() in busy poll was okay so should be a new
set_bit()?  
The problem with adding a bit for SCHED_THREADED is that you would
have to heavily modify napi_schedule_prep so that it would add the
bit. That is the reason for going with adding the bit to the busy
poll logic because it added no additional overhead. Adding another
atomic bit setting operation or heavily modifying the existing one
would add considerable overhead as it is either adding a complicated
conditional check to all NAPI calls, or adding an atomic operation to
the path for the threaded NAPI.
I wasn't thinking of modifying the main schedule logic, just the
threaded parts:

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ddf4cfc12615..6953005d06af 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -360,6 +360,7 @@ enum {
        NAPI_STATE_IN_BUSY_POLL,        /* sk_busy_loop() owns this NAPI */
        NAPI_STATE_PREFER_BUSY_POLL,    /* prefer busy-polling over softirq processing*/
        NAPI_STATE_THREADED,            /* The poll is performed inside its own thread*/
+       NAPI_STATE_SCHED_THREAD,        /* Thread owns the NAPI and will poll */
 };
 
 enum {
diff --git a/net/core/dev.c b/net/core/dev.c
index 6c5967e80132..23e53f971478 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4294,6 +4294,7 @@ static inline void ____napi_schedule(struct softnet_data *sd,
                 */
                thread = READ_ONCE(napi->thread);
                if (thread) {
+                       set_bit(NAPI_STATE_SCHED_THREAD, &napi->state);
                        wake_up_process(thread);
                        return;
                }
@@ -6486,7 +6487,8 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
                WARN_ON_ONCE(!(val & NAPIF_STATE_SCHED));
 
                new = val & ~(NAPIF_STATE_MISSED | NAPIF_STATE_SCHED |
-                             NAPIF_STATE_PREFER_BUSY_POLL);
+                             NAPIF_STATE_PREFER_BUSY_POLL |
+                             NAPI_STATE_SCHED_THREAD);
 
                /* If STATE_MISSED was set, leave STATE_SCHED set,
                 * because we will call napi->poll() one more time.
@@ -6971,7 +6973,9 @@ static int napi_thread_wait(struct napi_struct *napi)
        set_current_state(TASK_INTERRUPTIBLE);
 
        while (!kthread_should_stop() && !napi_disable_pending(napi)) {
-               if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
+               if (test_bit(NAPI_STATE_SCHED_THREAD, &napi->state)) {
+                       WARN_ON(!test_bit(test_bit(NAPI_STATE_SCHED,
+                                                  &napi->state)));
                        WARN_ON(!list_empty(&napi->poll_list));
                        __set_current_state(TASK_RUNNING);
                        return 0;
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help