Thread (16 messages), 5 authors, 2024-10-08

Re: selftests: net: pmtu.sh: Unable to handle kernel paging request at virtual address

From: Xin Long <lucien.xin@gmail.com>
Date: 2024-10-06 18:08:30
Also in: lkml

Sorry for bringing this issue up again; it recently occurred on my aarch64
kernel with blackhole_netdev backported. I tracked it down: when deleting
the netns, the path is:

In cleanup_net():

  default_device_exit_batch()
    unregister_netdevice_many()
      addrconf_ifdown() -> call_rcu(rcu, fib6_info_destroy_rcu) <--- [1]
    netdev_run_todo()
      rcu_barrier() <- [2]
  ip6_route_net_exit() -> dst_entries_destroy(net->ip6_dst_ops) <--- [3]

In fib6_info_destroy_rcu():

  dst_dev_put()
  dst_release() -> call_rcu(rcu, dst_destroy_rcu) <--- [5]

In dst_destroy_rcu():
  dst_destroy() -> dst_entries_add(dst->ops, -1); <--- [6]

fib6_info_destroy_rcu() is scheduled at [1], and rcu_barrier() at [2] waits
for it to finish. However, fib6_info_destroy_rcu() itself schedules another
callback, dst_destroy_rcu(), at [5], and nothing calls rcu_barrier() again
to wait for dst_destroy_rcu() to finish. This means dst_entries_add() at [6]
may run after dst_entries_destroy() at [3], and this UAF triggers the panic.

On Tue, Oct 17, 2023 at 1:02 PM Naresh Kamboju [off-list ref] wrote:
On Tue, 5 Sept 2023 at 17:55, Eric Dumazet [off-list ref] wrote:
On Tue, Sep 5, 2023 at 1:52 PM Hillf Danton [off-list ref] wrote:
On Mon, 4 Sep 2023 13:29:57 +0200 Eric Dumazet [off-list ref] wrote:
On Sun, Sep 3, 2023 at 5:57 AM Hillf Danton [off-list ref] wrote:
On Thu, 31 Aug 2023 15:12:30 +0200 Eric Dumazet [off-list ref] wrote:
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -163,8 +163,13 @@ EXPORT_SYMBOL(dst_dev_put);

 void dst_release(struct dst_entry *dst)
 {
-       if (dst && rcuref_put(&dst->__rcuref))
+       if (dst && rcuref_put(&dst->__rcuref)) {
+               if (!(dst->flags & DST_NOCOUNT)) {
+                       dst->flags |= DST_NOCOUNT;
+                       dst_entries_add(dst->ops, -1);
So I think it makes sense NOT to call dst_entries_add() in the path
dst_destroy_rcu() -> dst_destroy(), as the patch above does, but I don't
see that the patch was ever posted.

Hi, Eric, would you like to move forward with your patch above ?

Or we can also move the dst_entries_add(dst->ops, -1) from dst_destroy()
to dst_release():

Note, dst_destroy() is not used outside net/core/dst.c, we may delete
EXPORT_SYMBOL(dst_destroy) in the future.

Thanks.
Could this add happen after the rcu sync above?
I do not think so. All dst_release() should happen before netns removal.
        cpu2                    cpu3
        ====                    ====
        cleanup_net()           __sys_sendto
                                sock_sendmsg()
                                udpv6_sendmsg()
        synchronize_rcu();
                                dst_release()

Could this one be an exception?
No idea what you are trying to say.

Please give exact locations, instead of being rather vague.

Note that a UDP socket cannot send a packet while its netns is being
dismantled, because live sockets keep a reference on the netns.
Gentle reminder.
This is still an open issue.

# selftests: net: pmtu.sh
# TEST: ipv4: PMTU exceptions                                         [ OK ]
# TEST: ipv4: PMTU exceptions - nexthop objects                       [ OK ]
# TEST: ipv6: PMTU exceptions                                         [ OK ]
# TEST: ipv6: PMTU exceptions - nexthop objects                       [ OK ]
# TEST: ICMPv4 with DSCP and ECN: PMTU exceptions                     [ OK ]
# TEST: ICMPv4 with DSCP and ECN: PMTU exceptions - nexthop objects   [ OK ]
# TEST: UDPv4 with DSCP and ECN: PMTU exceptions                      [ OK ]
# TEST: UDPv4 with DSCP and ECN: PMTU exceptions - nexthop objects    [ OK ]
# TEST: IPv4 over vxlan4: PMTU exceptions                             [ OK ]
# TEST: IPv4 over vxlan4: PMTU exceptions - nexthop objects           [ OK ]
# TEST: IPv6 over vxlan4: PMTU exceptions                             [ OK ]
# TEST: IPv6 over vxlan4: PMTU exceptions - nexthop objects           [ OK ]
# TEST: IPv4 over vxlan6: PMTU exceptions                             [ OK ]
<1>[  155.820793] Unable to handle kernel paging request at virtual address ffff247020442000
<1>[  155.821495] Mem abort info:
<1>[  155.821719]   ESR = 0x0000000097b58004
<1>[  155.822046]   EC = 0x25: DABT (current EL), IL = 32 bits
<1>[  155.822412]   SET = 0, FnV = 0
<1>[  155.822648]   EA = 0, S1PTW = 0
<1>[  155.822925]   FSC = 0x04: level 0 translation fault
<1>[  155.823317] Data abort info:
<1>[  155.823590]   Access size = 4 byte(s)
<1>[  155.823886]   SSE = 1, SRT = 21
<1>[  155.824167]   SF = 1, AR = 0
<1>[  155.824450]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
<1>[  155.824847]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
<1>[  155.825345] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041d84000
<1>[  155.827244] [ffff247020442000] pgd=0000000000000000, p4d=0000000000000000
<0>[  155.828511] Internal error: Oops: 0000000097b58004 [#1] PREEMPT SMP
<4>[  155.829155] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel act_csum libcrc32c act_pedit cls_flower sch_prio veth vrf macvtap macvlan tap crct10dif_ce sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight dm_mod ip_tables x_tables [last unloaded: test_blackhole_dev]
<4>[  155.832289] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 6.6.0-rc6 #1
<4>[  155.832896] Hardware name: linux,dummy-virt (DT)
<4>[  155.833927] pstate: 824000c9 (Nzcv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
<4>[  155.834496] pc : percpu_counter_add_batch+0x24/0xcc
<4>[  155.835735] lr : dst_destroy+0x44/0x1e4

Links:
- https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.6-rc6/testrun/20613439/suite/log-parser-test/test/check-kernel-oops/log
- https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.6-rc6/testrun/20613439/suite/log-parser-test/tests/

- Naresh