Re: [PATCH] fix: net/batman-adv: batadv_interface_kill_vid: extra batadv_meshif_vlan_put after destroy
From: Sven Eckelmann <sven@narfation.org>
Date: 2026-06-27 06:16:14
Also in:
batman, lkml, stable
On Sat, 27 Jun 2026 11:46:36 +0800, WenTao Liang [off-list ref] wrote:
Hi,
not-acked
1. please don't send patches to netdev directly. See (from any recent
batadv.git, netdev/net.git netdev/net-next.git or torvalds/linux.git):
$ ./scripts/get_maintainer.pl 20260627034636.59693-1-vulab@iscas.ac.cn.mbx
Marek Lindner [off-list ref] (maintainer:BATMAN ADVANCED,blamed_fixes:1/1=100%)
Simon Wunderlich [off-list ref] (maintainer:BATMAN ADVANCED)
Antonio Quartulli [off-list ref] (maintainer:BATMAN ADVANCED,blamed_fixes:1/1=100%)
Sven Eckelmann [off-list ref] (maintainer:BATMAN ADVANCED)
b.a.t.m.a.n@lists.open-mesh.org (moderated list:BATMAN ADVANCED)
linux-kernel@vger.kernel.org (open list)
2. please add after the "PATCH" the tree which it should enter (in this case
"[PATCH batadv]". See:
./scripts/get_maintainer.pl --scm 20260627034636.59693-1-vulab@iscas.ac.cn.mbx|grep '^git'
git https://git.open-mesh.org/batadv.git
git git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
3. Please use a subject line which follows the kernel style. See
https://docs.kernel.org/process/submitting-patches.html#the-canonical-patch-formatA
- no "fix: "
- "batman-adv: " instead of "net/batman-adv: "
- most likely no "batadv_interface_kill_vid: "
- an actual summary of your change (because right now it says it adds(?) an extra put)
In batadv_interface_kill_vid(), batadv_meshif_vlan_get() acquires a reference on the vlan object. batadv_meshif_destroy_vlan() internally calls batadv_meshif_vlan_put() which balances that reference. However, an
No, this doesn't balance the reference. The reference put in this function is for the reference acquired by this function. The batadv_meshif_destroy_vlan() put is for the reference for its "from .ndo_vlan_rx_add_vid till .ndo_vlan_rx_kill_vid" lifetime. You can see exactly the same approach also in batadv_meshif_destroy_netlink() for its "untagged" vlan. A function which you didn't touch.
additional batadv_meshif_vlan_put(vlan) is called after batadv_meshif_destroy_vlan(), causing a refcount underflow and potential use-after-free of the vlan object.
No, doesn't cause an underflow in my setup. Please explain exactly how you tested this and came the conclusion that this would cause a use-after-free. Because I can't reproduce this and the patch in this form is causing a memory leak for me.
Remove the extra batadv_meshif_vlan_put(vlan) call.
No, this can't be the correct solution.
quoted hunk ↗ jump to hunk
diff --git a/net/batman-adv/mesh-interface.c b/net/batman-adv/mesh-interface.c index e5a55d24..7a1aeeca 100644 --- a/net/batman-adv/mesh-interface.c +++ b/net/batman-adv/mesh-interface.c@@ -693,9 +693,6 @@ static int batadv_interface_kill_vid(struct net_device *dev, __be16 proto, batadv_meshif_destroy_vlan(bat_priv, vlan); - /* finally free the vlan object */ - batadv_meshif_vlan_put(vlan); -
This looks wrong to me. Now it leaks the VLAN which was acquired at the
beginning of the function. When I add a kref_get-printk right before the
batadv_meshif_destroy_vlan() and in batadv_tt_local_entry_release() before the
puts:
refcnt before batadv_meshif_destroy_vlan: 3
refcnt after batadv_meshif_destroy_vlan: 2
refcnt before batadv_tt_local_entry_release: 2
refcnt after batadv_tt_local_entry_release: 1
As you can see, now the VLAN never reaches the 0 and thus isn't free'd. You can
also directly see the memory leak (which didn't happen before):
root@node01:~# ip l del dev bat0.10
[ 18.127153][ T368] refcnt before batadv_meshif_destroy_vlan: 3
[ 18.128792][ T368] refcnt after batadv_meshif_destroy_vlan: 2
[ 18.649318][ T12] refcnt before batadv_tt_local_entry_release: 2
[ 18.650220][ T12] refcnt after batadv_tt_local_entry_release: 1
root@node01:~# rmmod batman-adv
[ 27.033891][ T374] batman_adv: bat0: Interface deactivated: dummy0
[ 27.034522][ T374] batman_adv: bat0: Removing interface: dummy0
[ 27.038340][ T374] batman_adv: bat0: Interface deactivated: enp0s1
[ 27.038973][ T374] batman_adv: bat0: Removing interface: enp0s1
[ 27.044439][ T374] br0: port 1(bat0) entered disabled state
[ 27.049110][ T374] bat0 (unregistering): left allmulticast mode
[ 27.049486][ T374] bat0 (unregistering): left promiscuous mode
[ 27.049804][ T374] br0: port 1(bat0) entered disabled state
[ 27.096326][ T374] refcnt before batadv_tt_local_entry_release: 1
[ 27.096851][ T374] refcnt after batadv_tt_local_entry_release: 0
root@node01:~# modprobe batman-adv
root@node01:~# echo scan > /sys/kernel/debug/kmemleak
root@node01:~# echo scan > /sys/kernel/debug/kmemleak
[ 41.460324][ T361] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
root@node01:~# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff88800ab1bd00 (size 64):
comm "ip", pid 300, jiffies 4294893634
hex dump (first 32 bytes):
c0 cb c7 13 80 88 ff ff 0a 80 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 552e6e51):
kmemleak_alloc+0x55/0xa0
__kmalloc_cache_noprof+0x2f4/0x540
batadv_meshif_create_vlan+0x7c/0x450 [batman_adv]
batadv_interface_add_vid+0xb6/0xd0 [batman_adv]
vlan_add_rx_filter_info+0xee/0x160
vlan_vid_add+0x2f6/0x910
register_vlan_dev+0xc5/0x6f0
vlan_newlink+0x40e/0x6f0
rtnl_newlink_create+0x2e1/0x770
__rtnl_newlink+0x20b/0x9d0
rtnl_newlink+0x7f7/0xf90
rtnetlink_rcv_msg+0x811/0xbf0
netlink_rcv_skb+0x148/0x3f0
rtnetlink_rcv+0x19/0x20
netlink_unicast+0x5fc/0xa50
netlink_sendmsg+0x82b/0xd70
Because of the errors this patch introduces and the form of the patch: will not
be applied in batadv.git
We can discuss an actual fix when you can explain us how this problem can
actually be reproduced.
--
Sven Eckelmann [off-list ref]