Re: [PATCH net] net: hsr: avoid potential out-of-bound access in fill_frame_info()
From: Stephan Wurm <hidden>
Date: 2025-01-21 15:16:04
Am 20. Jan 13:24 hat Eric Dumazet geschrieben:
On Mon, Jan 20, 2025 at 8:32 AM Stephan Wurm [off-list ref] wrote:quoted
Applying the new instrumentation gives me the following stack trace: kernel: skb len=170 headroom=2 headlen=170 tailroom=20 mac=(2,14) mac_len=14 net=(16,-1) trans=-1 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0x0 start=0 offset=0 ip_summed=0 complete_sw=0 valid=0 level=0) hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0 priority=0x0 mark=0x0 alloc_cpu=0 vlan_all=0x0 encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0) kernel: dev name=prp0 feat=0x0000000000007000 kernel: sk family=17 type=3 proto=0 kernel: skb headroom: 00000000: 74 00 kernel: skb linear: 00000000: 01 0c cd 01 00 01 00 d0 93 53 9c cb 81 00 80 00 kernel: skb linear: 00000010: 88 b8 00 01 00 98 00 00 00 00 61 81 8d 80 16 52 kernel: skb linear: 00000020: 45 47 44 4e 43 54 52 4c 2f 4c 4c 4e 30 24 47 4f kernel: skb linear: 00000030: 24 47 6f 43 62 81 01 14 82 16 52 45 47 44 4e 43 kernel: skb linear: 00000040: 54 52 4c 2f 4c 4c 4e 30 24 44 73 47 6f 6f 73 65 kernel: skb linear: 00000050: 83 07 47 6f 49 64 65 6e 74 84 08 67 8d f5 93 7e kernel: skb linear: 00000060: 76 c8 00 85 01 01 86 01 00 87 01 00 88 01 01 89 kernel: skb linear: 00000070: 01 00 8a 01 02 ab 33 a2 15 83 01 00 84 03 03 00 kernel: skb linear: 00000080: 00 91 08 67 8d f5 92 77 4b c6 1f 83 01 00 a2 1a kernel: skb linear: 00000090: a2 06 85 01 00 83 01 00 84 03 03 00 00 91 08 67 kernel: skb linear: 000000a0: 8d f5 92 77 4b c6 1f 83 01 00 kernel: skb tailroom: 00000000: 80 18 02 00 fe 4e 00 00 01 01 08 0a 4f fd 5e d1 kernel: skb tailroom: 00000010: 4f fd 5e cd kernel: ------------[ cut here ]------------ kernel: WARNING: CPU: 0 PID: 751 at /net/hsr/hsr_forward.c:605 fill_frame_info+0x180/0x19c kernel: Modules linked in: kernel: CPU: 0 PID: 751 Comm: reg61850 Not tainted 6.6.69-ga7a5cc0c39f0 #1 kernel: Hardware name: Freescale LS1021A kernel: unwind_backtrace from show_stack+0x10/0x14 kernel: show_stack from dump_stack_lvl+0x40/0x4c kernel: dump_stack_lvl from __warn+0x94/0xc0 kernel: __warn from warn_slowpath_fmt+0x1b4/0x1bc kernel: warn_slowpath_fmt from fill_frame_info+0x180/0x19c kernel: fill_frame_info from hsr_forward_skb+0x54/0x118 kernel: hsr_forward_skb from hsr_dev_xmit+0x60/0xc4 kernel: hsr_dev_xmit from dev_hard_start_xmit+0xa0/0xe4 kernel: dev_hard_start_xmit from __dev_queue_xmit+0x144/0x5e8 kernel: __dev_queue_xmit from packet_snd+0x5c0/0x784 kernel: packet_snd from sock_write_iter+0xa0/0x10c kernel: sock_write_iter from vfs_write+0x3ac/0x41c kernel: vfs_write from ksys_write+0xbc/0xf0 kernel: ksys_write from ret_fast_syscall+0x0/0x4c kernel: Exception stack(0xc0d8dfa8 to 0xc0d8dff0) kernel: dfa0: 000000aa 73058e53 00000012 73058e53 000000aa 00000000 kernel: dfc0: 000000aa 73058e53 00000012 00000004 6ebf9940 0000000a 00000000 00000000 kernel: dfe0: 00000004 6ebf90f8 766a17ad 7661e5e6 kernel: ---[ end trace 0000000000000000 ]---Thanks. So hsr_forward() can be used in tx path, not forwarding as its name would imply... what a mess. It looks like it might be an af_packet issue, or an application bug. packet_parse_headers() should really for this vlan packet set the network header correctly. Otherwise, I do not see how hsr could use mac_len at all. I am afraid commit 48b491a5cc74333c ("net: hsr: fix mac_len checks") was not good enough.
I did some additional experiments. First, I was able to get v6.13 running on the system, although it did not fix my issue. Then I played around with VLAN interfaces. I created an explicit VLAN interface on top of the prp interface. In that case the VLAN header gets transparently attached to the tx frames and forwarding through the interface layers works as expected. It was even possible to get my application working on top of the vlan interface, but it resulted in two VLAN headers - the inner from the application, the outer from the vlan interface. So when sending vlan tagged frames directly from an application through a prp interface the mac_len field does not get updated, even though the VLAN protocol header is properly detected; when sending frames through an explicit vlan interface, the mac_len seems to be properly parsed into the skb. Now I am running out of ideas how to proceed. For the time being I would locally revert this fix, to make my application working again. But I can support in testing proposed solutions.