[PATCH v1 net-next 00/14] net: Support per-netns device unregistration
From: Kuniyuki Iwashima <kuniyu@google.com>
Date: 2026-07-01 21:43:43
The biggest blocker to per-netns RTNL is netdev unregistration.
It starts within a single netns, but it can eventually involve
multiple namespaces.
There are three types of such cross-netns devices:
  1. Paired devices (e.g., netkit, veth, vxcan)
     -> Unregistering one device also deletes its peer, which
        may reside in another netns.
  2. Tunnel devices (e.g., bareudp, geneve)
     -> Destroying a netns removes devices in another netns if
        their backend sockets reside in the dying netns.
  3. Stacked devices (e.g., ipvlan, macvlan)
     -> Removing the lower device also removes multiple upper
        devices, each of which may reside in a different namespace.
While the first two device types require at most two rtnl_net_lock()s,
the stacked type has no upper limit on the number of namespaces involved.
This makes it impossible to lock all necessary namespaces in advance.
This series introduces per-netns work, initially suggested at
NetConf 2024, to delegate the unregistration of such cross-netns
devices.
https://netdev.bots.linux.dev/netconf/2024/kuniyu.pdf#page=62
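The delegation idea can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code in this series: every name in it (model_dev, model_ns, model_ns_work(), model_unregister_lower(), and so on) is hypothetical, the per-netns RTNL is reduced to a flag, and the work is run by calling model_ns_work() directly rather than by a workqueue. It shows only that a lower device can be unregistered under its own namespace lock while upper devices in other namespaces are queued and handled later, one namespace lock at a time.

```c
/* Userspace model only. Every name here is hypothetical; not kernel code. */
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_NS   8
#define MODEL_MAX_WORK 16

struct model_dev {
	const char *name;
	int ns;          /* index of the namespace the device lives in */
	int registered;  /* 1 while registered, 0 once unregistered */
};

struct model_ns {
	int locked;                              /* stands in for the per-netns RTNL */
	struct model_dev *work[MODEL_MAX_WORK];  /* devices queued for unregistration */
	size_t nr_work;
};

static struct model_ns model_netns[MODEL_MAX_NS];
static int model_locks_held;      /* namespace locks held right now */
static int model_max_locks_held;  /* high-water mark of the above */

static void model_ns_lock(int ns)
{
	assert(!model_netns[ns].locked);
	model_netns[ns].locked = 1;
	if (++model_locks_held > model_max_locks_held)
		model_max_locks_held = model_locks_held;
}

static void model_ns_unlock(int ns)
{
	assert(model_netns[ns].locked);
	model_netns[ns].locked = 0;
	model_locks_held--;
}

/* Per-netns work: unregister everything queued for ns, holding only its lock. */
static void model_ns_work(int ns)
{
	struct model_ns *n = &model_netns[ns];
	size_t i;

	model_ns_lock(ns);
	for (i = 0; i < n->nr_work; i++)
		n->work[i]->registered = 0;
	n->nr_work = 0;
	model_ns_unlock(ns);
}

/*
 * Unregister lower under its own namespace lock. Uppers in other
 * namespaces are queued on their namespace's work list instead of
 * being unregistered here. Returns 0 on success, -1 if a queue is full.
 */
static int model_unregister_lower(struct model_dev *lower,
				  struct model_dev **uppers, size_t nr_uppers)
{
	int err = 0;
	size_t i;

	model_ns_lock(lower->ns);
	lower->registered = 0;
	for (i = 0; i < nr_uppers; i++) {
		struct model_ns *n = &model_netns[uppers[i]->ns];

		if (uppers[i]->ns == lower->ns)
			uppers[i]->registered = 0;
		else if (n->nr_work < MODEL_MAX_WORK)
			n->work[n->nr_work++] = uppers[i];
		else
			err = -1;
	}
	model_ns_unlock(lower->ns);
	return err;
}
```

In this model, unregistering a lower device in namespace 1 with upper devices in namespaces 2 and 3, and then running model_ns_work(2) and model_ns_work(3), unregisters all three devices while model_max_locks_held never exceeds 1.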
The first half of the series wraps NETDEV_UNREGISTER (in core) with
per-netns RTNL, adds a helper for per-netns device unregistration,
and forces per-netns device unregistration in the core code when
CONFIG_DEBUG_NET_SMALL_RTNL=y.
The latter half picks one driver of each type (veth, bareudp, ipvlan)
and converts it to support per-netns device unregistration,
although the operations are **still serialised under RTNL** for now.
Please note that this series focuses only on the device unregistration
paths. For example, ASSERT_RTNL() calls remain in other paths, and
Sashiko may point them out, but they are out of scope.
This is just the first step, and we need more incremental changes to
completely remove RTNL anyway.
Now, we can see that unregistering a lower device (veth0 below)
removes upper devices (ipvl2, ipvl3) in different namespaces via
per-netns work, which runs under a different PID. The lower device
(veth0) is freed only after all upper ipvlan devices have called
netdev_put() in ipvlan_uninit().
# ip netns add ns1
# ip netns add ns2
# ip netns add ns3
# ip -n ns1 link add veth0 type veth peer veth1
# ip -n ns2 link add ipvl2 link veth0 link-netns ns1 type ipvlan mode l2
# ip -n ns3 link add ipvl3 link veth0 link-netns ns1 type ipvlan mode l2
# ip -n ns1 link del veth0
# bpftrace -e '#include <linux/netdevice.h>
  kprobe:ipvlan_uninit,
  kprobe:veth_dellink,
  kprobe:free_netdev {
      $dev = (struct net_device *)arg0;
      printf("PID: %d | DEV: %s%s\n", pid, $dev->name, kstack());
  }'
PID: 2010 | DEV: veth0
veth_dellink+5
rtnl_dellink+1213
rtnetlink_rcv_msg+1791
...
PID: 440 | DEV: ipvl2
ipvlan_uninit+5
unregister_netdevice_many_notify+7129
unregister_netdevice_many_net+1050
rtnl_net_work_func+136
...
PID: 440 | DEV: ipvl2
free_netdev+5
netdev_run_todo+4798
process_scheduled_works+2538
...
PID: 440 | DEV: ipvl3
ipvlan_uninit+5
unregister_netdevice_many_notify+7129
unregister_netdevice_many_net+1050
rtnl_net_work_func+136
process_scheduled_works+2538
...
PID: 2010 | DEV: veth0
free_netdev+5
netdev_run_todo+4798
rtnl_dellink+1507
rtnetlink_rcv_msg+1791
...
PID: 440 | DEV: ipvl3
free_netdev+5
netdev_run_todo+4798
process_scheduled_works+2538
...
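The ordering in the trace above, where veth0 is freed only after both ipvlan_uninit() calls, can be modelled as plain reference counting. The following is a minimal userspace sketch, not the kernel implementation: model_lower, model_hold(), model_put(), model_start_unregister() and model_try_free() are hypothetical names, and the waiting done by the unregistering context is reduced to a retry of model_try_free().

```c
/* Userspace model only. Every name here is hypothetical; not kernel code. */
#include <assert.h>
#include <stdbool.h>

struct model_lower {
	int refs;            /* one reference per upper device still attached */
	bool unregistering;  /* set once unregistration has started */
	bool freed;
};

/* An upper device takes a reference when it attaches to the lower device. */
static void model_hold(struct model_lower *l)
{
	assert(!l->freed);
	l->refs++;
}

/* An upper device drops its reference from its uninit path. */
static void model_put(struct model_lower *l)
{
	assert(l->refs > 0);
	l->refs--;
}

static void model_start_unregister(struct model_lower *l)
{
	l->unregistering = true;
}

/*
 * Called by the context that unregistered the lower device. Frees it
 * only when no upper device holds a reference; returns whether it did.
 */
static bool model_try_free(struct model_lower *l)
{
	if (!l->unregistering || l->refs > 0)
		return false;
	l->freed = true;
	return true;
}
```

With two upper devices holding references, model_try_free() keeps returning false after model_start_unregister() until both have called model_put(), which mirrors the order of the free_netdev() call for veth0 relative to the two ipvlan_uninit() calls in the trace.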
Kuniyuki Iwashima (14):
rtnetlink: Lock sock_net(skb->sk) in rtnl_newlink().
rtnetlink: Call unregister_netdevice_many() only once in
rtnl_link_unregister().
rtnetlink: Add per-netns rtnl_work.
net: Wrap default_device_exit_net() with __rtnl_net_lock().
net: Hold __rtnl_net_lock() in netdev_wait_allrefs_any().
net: Add per-netns netdev unregistration infra.
net: Call unregister_netdevice_many() per netns.
veth: Support per-netns device unregistration.
bareudp: Protect bareudp_list with mutex.
bareudp: Support per-netns netdev unregistration.
ipvlan: Convert ipvl_port.count to refcount_t.
ipvlan: Synchronise ipvlan_init() and ipvlan_uninit() for the same
lower dev.
ipvlan: Protect ipvl_port.ipvlans with mutex.
ipvlan: Support per-netns netdev unregistration.
drivers/net/bareudp.c | 43 ++++++++-
drivers/net/ipvlan/ipvlan.h | 18 +++-
drivers/net/ipvlan/ipvlan_main.c | 153 +++++++++++++++++++++++++------
drivers/net/ipvlan/ipvtap.c | 16 ++--
drivers/net/veth.c | 34 ++++---
include/linux/netdevice.h | 22 +++++
include/linux/rtnetlink.h | 8 ++
include/net/net_namespace.h | 3 +
net/core/dev.c | 129 +++++++++++++++++++++++++-
net/core/net_namespace.c | 4 +
net/core/rtnetlink.c | 57 ++++++++++--
11 files changed, 418 insertions(+), 69 deletions(-)
--
2.55.0.rc0.799.gd6f94ed593-goog