Re: [PATCH bpf-next] bpf: prevent non-IPv4 socket to be added into sock hash
From: John Fastabend <john.fastabend@gmail.com>
Date: 2018-05-31 23:32:04
Subsystem: bpf [general] (safe dynamic programs and tools), the rest
Maintainers: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Linus Torvalds
On 05/30/2018 02:29 PM, Wei Wang wrote:
From: Wei Wang <redacted>
Sock hash only supports IPv4 socket proto right now.
If a non-IPv4 socket gets stored in the BPF map, sk->sk_prot gets
overwritten with the v4 tcp prot.
Syzkaller reported the following related issue on an IPv6 socket:
BUG: KASAN: slab-out-of-bounds in ip6_dst_idev include/net/ip6_fib.h:203 [inline]
BUG: KASAN: slab-out-of-bounds in ip6_xmit+0x2002/0x23f0 net/ipv6/ip6_output.c:264
Read of size 8 at addr ffff8801b300edb0 by task syz-executor888/4522
CPU: 0 PID: 4522 Comm: syz-executor888 Not tainted 4.17.0-rc4+ #17
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_address_description+0x6c/0x20b mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
ip6_dst_idev include/net/ip6_fib.h:203 [inline]
ip6_xmit+0x2002/0x23f0 net/ipv6/ip6_output.c:264
inet6_csk_xmit+0x377/0x630 net/ipv6/inet6_connection_sock.c:139
tcp_transmit_skb+0x1be0/0x3e40 net/ipv4/tcp_output.c:1159
tcp_send_syn_data net/ipv4/tcp_output.c:3441 [inline]
tcp_connect+0x2207/0x45a0 net/ipv4/tcp_output.c:3480
tcp_v4_connect+0x1934/0x1d50 net/ipv4/tcp_ipv4.c:272
__inet_stream_connect+0x943/0x1120 net/ipv4/af_inet.c:655
tcp_sendmsg_fastopen net/ipv4/tcp.c:1162 [inline]
tcp_sendmsg_locked+0x2859/0x3ee0 net/ipv4/tcp.c:1209
tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1447
inet_sendmsg+0x19f/0x690 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:629 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:639
___sys_sendmsg+0x805/0x940 net/socket.c:2117
__sys_sendmsg+0x115/0x270 net/socket.c:2155
__do_sys_sendmsg net/socket.c:2164 [inline]
__se_sys_sendmsg net/socket.c:2162 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43ff99
RSP: 002b:00007ffc00bd1cf8 EFLAGS: 00000217 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043ff99
RDX: 0000000020000000 RSI: 0000000020000580 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000217 R12: 00000000004018c0
R13: 0000000000401950 R14: 0000000000000000 R15: 0000000000000000
Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
Signed-off-by: Wei Wang <redacted>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
---
Hi Wei,
Thanks for the report and fix. It would be better to fix the
root cause so that IPv6 works as intended.
I'm testing the following now,
Author: John Fastabend [off-list ref]
Date: Thu May 31 14:38:59 2018 -0700
sockmap: fix crash when ipv6 sock is added by adding support for IPv6
Apparently we had a testing escape and missed IPv6. This fixes a crash
where we assign tcp_prot to IPv6 sockets instead of tcpv6_prot.
Signed-off-by: John Fastabend [off-list ref]
diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 52a91d8..e191122 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -41,6 +41,7 @@
 #include <linux/mm.h>
 #include <net/strparser.h>
 #include <net/tcp.h>
+#include <net/transp_v6.h>
 #include <linux/ptr_ring.h>
 #include <net/inet_common.h>
 #include <linux/sched/signal.h>
@@ -162,6 +163,8 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
 }
 
 static struct proto tcp_bpf_proto;
+static struct proto tcpv6_bpf_proto;
+
 static int bpf_tcp_init(struct sock *sk)
 {
 	struct smap_psock *psock;
@@ -182,13 +185,21 @@ static int bpf_tcp_init(struct sock *sk)
 	psock->sk_proto = sk->sk_prot;
 
 	if (psock->bpf_tx_msg) {
+		tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg;
+		tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage;
+		tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg;
+		tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
 		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
 		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
 		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
 		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
 	}
 
-	sk->sk_prot = &tcp_bpf_proto;
+	if (sk->sk_family == AF_INET6)
+		sk->sk_prot = &tcpv6_bpf_proto;
+	else
+		sk->sk_prot = &tcp_bpf_proto;
+
 	rcu_read_unlock();
 	return 0;
 }
@@ -1113,6 +1124,8 @@ static int bpf_tcp_ulp_register(void)
 {
 	tcp_bpf_proto = tcp_prot;
 	tcp_bpf_proto.close = bpf_tcp_close;
+	tcpv6_bpf_proto = tcpv6_prot;
+	tcpv6_bpf_proto.close = bpf_tcp_close;
 	/* Once BPF TX ULP is registered it is never unregistered. It
 	 * will be in the ULP list for the lifetime of the system. Doing
 	 * duplicate registers is not a problem.
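
To make the intent of the patch above easier to follow, here is a rough user-space model of the selection logic. It is only a sketch and not kernel code: the model_* types and functions are hypothetical stand-ins for struct proto, tcp_prot, tcpv6_prot and the two BPF templates, and the handlers are represented by their names as strings. The idea it tries to show is that each address family keeps a template copied from its own base proto, with only the BPF hooks overridden, so an IPv6 socket never ends up with the IPv4 base.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical stand-in for struct proto: handlers are modelled as names. */
struct model_proto {
	const char *base;	/* base proto this template was copied from */
	const char *sendmsg;	/* sendmsg handler currently installed */
	const char *close;	/* close handler currently installed */
};

/* Stand-ins for the kernel's tcp_prot and tcpv6_prot. */
static const struct model_proto model_tcp_prot = {
	"tcp_prot", "tcp_sendmsg", "tcp_close"
};
static const struct model_proto model_tcpv6_prot = {
	"tcpv6_prot", "tcp_sendmsg", "tcp_close"
};

/* One BPF template per address family, as in the patch. */
static struct model_proto model_tcp_bpf_proto;
static struct model_proto model_tcpv6_bpf_proto;

/* Models the registration step: copy each base proto, then override hooks. */
void model_register(void)
{
	model_tcp_bpf_proto = model_tcp_prot;
	model_tcp_bpf_proto.close = "bpf_tcp_close";
	model_tcp_bpf_proto.sendmsg = "bpf_tcp_sendmsg";

	model_tcpv6_bpf_proto = model_tcpv6_prot;
	model_tcpv6_bpf_proto.close = "bpf_tcp_close";
	model_tcpv6_bpf_proto.sendmsg = "bpf_tcp_sendmsg";
}

/* Models the init step: pick the template that matches the socket family. */
const struct model_proto *model_pick_proto(int family)
{
	if (family == AF_INET6)
		return &model_tcpv6_bpf_proto;
	return &model_tcp_bpf_proto;
}

/* Small self-check: every family keeps its own base and gets the BPF hooks. */
void model_selfcheck(void)
{
	model_register();
	assert(strcmp(model_pick_proto(AF_INET)->base, "tcp_prot") == 0);
	assert(strcmp(model_pick_proto(AF_INET6)->base, "tcpv6_prot") == 0);
	assert(strcmp(model_pick_proto(AF_INET6)->close, "bpf_tcp_close") == 0);
}
```

Calling model_selfcheck() from a test main() exercises both families. In this model the crash corresponds to model_pick_proto() returning the IPv4 template for every family, which is what the unconditional assignment did before the change.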