Re: [PATCH bpf-next] bpf: prevent non-IPv4 socket to be added into sock hash

From: John Fastabend <john.fastabend@gmail.com>
Date: 2018-05-31 23:32:04
Subsystem: bpf [general] (safe dynamic programs and tools), the rest · Maintainers: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Linus Torvalds

On 05/30/2018 02:29 PM, Wei Wang wrote:

From: Wei Wang <redacted>

Sock hash only supports IPv4 socket proto right now.
If a non-IPv4 socket gets stored in the BPF map, sk->sk_prot gets
overwritten with the v4 tcp prot.

Syzkaller reported the following related issue on an IPv6 socket:
BUG: KASAN: slab-out-of-bounds in ip6_dst_idev include/net/ip6_fib.h:203 [inline]
BUG: KASAN: slab-out-of-bounds in ip6_xmit+0x2002/0x23f0 net/ipv6/ip6_output.c:264
Read of size 8 at addr ffff8801b300edb0 by task syz-executor888/4522

CPU: 0 PID: 4522 Comm: syz-executor888 Not tainted 4.17.0-rc4+ #17
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_address_description+0x6c/0x20b mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
 ip6_dst_idev include/net/ip6_fib.h:203 [inline]
 ip6_xmit+0x2002/0x23f0 net/ipv6/ip6_output.c:264
 inet6_csk_xmit+0x377/0x630 net/ipv6/inet6_connection_sock.c:139
 tcp_transmit_skb+0x1be0/0x3e40 net/ipv4/tcp_output.c:1159
 tcp_send_syn_data net/ipv4/tcp_output.c:3441 [inline]
 tcp_connect+0x2207/0x45a0 net/ipv4/tcp_output.c:3480
 tcp_v4_connect+0x1934/0x1d50 net/ipv4/tcp_ipv4.c:272
 __inet_stream_connect+0x943/0x1120 net/ipv4/af_inet.c:655
 tcp_sendmsg_fastopen net/ipv4/tcp.c:1162 [inline]
 tcp_sendmsg_locked+0x2859/0x3ee0 net/ipv4/tcp.c:1209
 tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1447
 inet_sendmsg+0x19f/0x690 net/ipv4/af_inet.c:798
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:639
 ___sys_sendmsg+0x805/0x940 net/socket.c:2117
 __sys_sendmsg+0x115/0x270 net/socket.c:2155
 __do_sys_sendmsg net/socket.c:2164 [inline]
 __se_sys_sendmsg net/socket.c:2162 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2162
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43ff99
RSP: 002b:00007ffc00bd1cf8 EFLAGS: 00000217 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043ff99
RDX: 0000000020000000 RSI: 0000000020000580 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000217 R12: 00000000004018c0
R13: 0000000000401950 R14: 0000000000000000 R15: 0000000000000000

Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
Signed-off-by: Wei Wang <redacted>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
---

Hi Wei,

Thanks for the report and fix. It would be better to fix the
root cause so that IPv6 works as intended.
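
To make the root cause concrete, here is a minimal userspace sketch of the
proto selection. It is not kernel code: the model_* structs, enum, and
functions are hypothetical stand-ins invented for illustration, and the
handlers are represented only by their names as strings. It shows that
assigning the v4-derived copy to every socket leaves an IPv6 socket with
tcp_v4_connect (the frame seen in the trace above), while choosing the copy by
family keeps the v6 handler and still overrides close.

```c
/* Simplified userspace stand-ins; NOT the kernel's struct proto/struct sock. */
enum model_family { MODEL_AF_INET, MODEL_AF_INET6 };

struct model_proto {
	const char *connect;	/* name of the connect handler */
	const char *close;	/* name of the close handler */
};

struct model_sock {
	enum model_family family;
	const struct model_proto *prot;
};

/* Base protos, modeled after the handler names in tcp_prot/tcpv6_prot. */
static const struct model_proto model_tcp_prot = {
	"tcp_v4_connect", "tcp_close"
};
static const struct model_proto model_tcpv6_prot = {
	"tcp_v6_connect", "tcp_close"
};

static struct model_proto model_tcp_bpf_proto;
static struct model_proto model_tcpv6_bpf_proto;

/* Copy each base proto and override only close. */
static void model_register(void)
{
	model_tcp_bpf_proto = model_tcp_prot;
	model_tcp_bpf_proto.close = "bpf_tcp_close";
	model_tcpv6_bpf_proto = model_tcpv6_prot;
	model_tcpv6_bpf_proto.close = "bpf_tcp_close";
}

/* Problematic behavior: every socket gets the v4-derived copy. */
static void model_init_v4_only(struct model_sock *sk)
{
	sk->prot = &model_tcp_bpf_proto;
}

/* Family-aware behavior: pick the copy that matches the socket. */
static void model_init_by_family(struct model_sock *sk)
{
	if (sk->family == MODEL_AF_INET6)
		sk->prot = &model_tcpv6_bpf_proto;
	else
		sk->prot = &model_tcp_bpf_proto;
}
```

Calling model_register() once and then model_init_by_family() on a socket
whose family is MODEL_AF_INET6 leaves prot->connect as "tcp_v6_connect" with
prot->close as "bpf_tcp_close"; model_init_v4_only() on the same socket yields
"tcp_v4_connect" instead.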

I'm testing the following now:

Author: John Fastabend [off-list ref]
Date:   Thu May 31 14:38:59 2018 -0700

    sockmap: fix crash when ipv6 sock is added by adding support for IPv6
    
    Apparently we had a testing escape and missed IPv6. This fixes a crash
    where we assign tcp_prot to IPv6 sockets instead of tcpv6_prot.
    
    Signed-off-by: John Fastabend [off-list ref]

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 52a91d8..e191122 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c

@@ -41,6 +41,7 @@
 #include <linux/mm.h>
 #include <net/strparser.h>
 #include <net/tcp.h>
+#include <net/transp_v6.h>
 #include <linux/ptr_ring.h>
 #include <net/inet_common.h>
 #include <linux/sched/signal.h>

@@ -162,6 +163,8 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
 }
 
 static struct proto tcp_bpf_proto;
+static struct proto tcpv6_bpf_proto;
+
 static int bpf_tcp_init(struct sock *sk)
 {
        struct smap_psock *psock;

@@ -182,13 +185,21 @@ static int bpf_tcp_init(struct sock *sk)
        psock->sk_proto = sk->sk_prot;
 
        if (psock->bpf_tx_msg) {
+               tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg;
+               tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage;
+               tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg;
+               tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
                tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
                tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
                tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
                tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
        }
 
-       sk->sk_prot = &tcp_bpf_proto;
+       if (sk->sk_family == AF_INET6)
+               sk->sk_prot = &tcpv6_bpf_proto;
+       else
+               sk->sk_prot = &tcp_bpf_proto;
+
        rcu_read_unlock();
        return 0;
 }

@@ -1113,6 +1124,8 @@ static int bpf_tcp_ulp_register(void)
 {
        tcp_bpf_proto = tcp_prot;
        tcp_bpf_proto.close = bpf_tcp_close;
+       tcpv6_bpf_proto = tcpv6_prot;
+       tcpv6_bpf_proto.close = bpf_tcp_close;
        /* Once BPF TX ULP is registered it is never unregistered. It
         * will be in the ULP list for the lifetime of the system. Doing
         * duplicate registers is not a problem.
