[PATCH net v2] netpoll: fix a use-after-free on shutdown path
From: Breno Leitao <leitao@debian.org>
Date: 2026-06-25 12:05:08
Also in: lkml, stable
Subsystem: networking [general], the rest
Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds
There is a use-after-free in netpoll, which KASAN detects reliably:

  BUG: KASAN: slab-use-after-free in _raw_spin_lock_irqsave+0x3b/0x80
  Read of size 1 at addr ... by task kworker/9:1
  Workqueue: events queue_process

  Call Trace:
   skb_dequeue+0x1e/0xb0
   queue_process+0x2c/0x600
   process_scheduled_works+0x4b6/0x850
   worker_thread+0x414/0x5a0

  Allocated by task 242:
   __netpoll_setup+0x201/0x4a0
   netpoll_setup+0x249/0x550
   enabled_store+0x32f/0x380

  Freed by task 0:
   kfree+0x1b7/0x540
   rcu_core+0x3f8/0x7a0

The problem happens when there is a pending TX worker running in
parallel with the cleanup path.

This is what happens on the netpoll shutdown path:

 1) __netpoll_cleanup() is called
 2) dev->npinfo is set to NULL
 3) call_rcu() is invoked with rcu_cleanup_netpoll_info()
   3.1) rcu_cleanup_netpoll_info() tries to cancel all workers with
        cancel_delayed_work(), but doesn't wait for the worker to finish
 4) rcu_cleanup_netpoll_info() then calls kfree(npinfo)

Because 3.1) doesn't wait for a running worker (as the comment says, "we
can't call cancel_delayed_work_sync here, as we are in softirq"), the TX
worker can still run after 4).
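
The ordering problem can be sketched with a small userspace toy model.
Everything below is hypothetical (toy_* names, the three-state enum) and
only mimics the documented behaviour of the cancel helpers; it is not
kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names mimic, but are not, the kernel workqueue API. */
enum work_state { WORK_IDLE, WORK_PENDING, WORK_RUNNING };

struct toy_npinfo {
	enum work_state tx_work;
	bool freed;
};

/* Like cancel_delayed_work(): dequeues a pending item, never waits. */
static bool toy_cancel(struct toy_npinfo *n)
{
	if (n->tx_work == WORK_PENDING) {
		n->tx_work = WORK_IDLE;
		return true;
	}
	return false;	/* idle or already running: nothing is stopped */
}

/* Like the _sync variants: also waits for a running callback to return. */
static void toy_cancel_sync(struct toy_npinfo *n)
{
	n->tx_work = WORK_IDLE;	/* models "callback has finished" */
}

/* Step 4) of the list: the object is released. */
static void toy_free(struct toy_npinfo *n)
{
	assert(!n->freed);	/* guard against a double free in the model */
	n->freed = true;
}

/* True if a still-running worker would now touch freed memory. */
static bool toy_use_after_free(const struct toy_npinfo *n)
{
	return n->freed && n->tx_work == WORK_RUNNING;
}
```

In this model, a worker that is already running survives toy_cancel(), so
toy_free() followed by toy_use_after_free() reports the bug; with
toy_cancel_sync() before the free it does not.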
TL;DR: queue_process() is not an RCU reader; it reaches npinfo through
the work item via container_of(), so the RCU grace period does not
protect it.
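
As a minimal userspace sketch of that point (fake_work, fake_npinfo and
npinfo_from_work() are hypothetical stand-ins, and the macro is a
simplified version of the kernel's container_of()), a work callback
recovers its enclosing object by pointer arithmetic alone, without
looking at dev->npinfo:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace container_of(); the kernel version adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct delayed_work and struct netpoll_info. */
struct fake_work {
	int pending;
};

struct fake_npinfo {
	int txq_len;
	struct fake_work tx_work;
};

/* A work callback is handed only the embedded work item. */
static struct fake_npinfo *npinfo_from_work(struct fake_work *work)
{
	assert(work != NULL);
	return container_of(work, struct fake_npinfo, tx_work);
}
```

Clearing a separately published pointer to the object therefore has no
effect on what the callback can reach.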
Use disable_delayed_work_sync() to ensure the worker is completely
stopped and prevent any future re-arming attempts. Once npinfo is set
to NULL, senders will bail out and not queue new work. The disable flag
ensures any in-flight re-arming attempts also fail silently.
In the future, we can do the cleanup inline here without needing the
npinfo->rcu rcu_head, but that is net-next material.
Cc: stable@vger.kernel.org
Fixes: 38e6bc185d95 ("netpoll: make __netpoll_cleanup non-block")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Remove the synchronize_rcu() and keep cancelling the tx_work
  before call_rcu(). (Jakub)
- Link to v1: https://lore.kernel.org/r/20260622-netpoll_rcu_fix-v1-1-15c3285e92e6@debian.org
---
net/core/netpoll.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 229dde818ab33..96d5945e6a30f 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -633,14 +633,6 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 			container_of(rcu_head, struct netpoll_info, rcu);
 
 	skb_queue_purge(&npinfo->txq);
-
-	/* we can't call cancel_delayed_work_sync here, as we are in softirq */
-	cancel_delayed_work(&npinfo->tx_work);
-
-	/* clean after last, unfinished work */
-	__skb_queue_purge(&npinfo->txq);
-	/* now cancel it again */
-	cancel_delayed_work(&npinfo->tx_work);
 	kfree(npinfo);
 }
 
@@ -664,6 +656,7 @@ static void __netpoll_cleanup(struct netpoll *np)
 			ops->ndo_netpoll_cleanup(np->dev);
 
 		RCU_INIT_POINTER(np->dev->npinfo, NULL);
+		disable_delayed_work_sync(&npinfo->tx_work);
 		call_rcu(&npinfo->rcu, rcu_cleanup_netpoll_info);
 	}
 
---
base-commit: d07d80b6a129a44538cda1549b7acf95154fb197
change-id: 20260622-netpoll_rcu_fix-def7bce1207a

Best regards,
-- 
Breno Leitao