Re: KASAN: stack-out-of-bounds Read in __schedule

From: Dmitry Vyukov <dvyukov@google.com>
Date: 2018-08-30 15:40:59
Also in: linux-ext4, lkml

On Thu, Aug 30, 2018 at 7:19 AM, Dmitry Vyukov [off-list ref] wrote:

On Thu, Aug 30, 2018 at 2:52 AM, Daniel Borkmann [off-list ref] wrote:


Hello,

syzbot found the following crash on:

HEAD commit:    5b394b2ddf03 Linux 4.19-rc1
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14f4d8e1400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=49927b422dcf0b29
dashboard link: https://syzkaller.appspot.com/bug?extid=45a34334c61a8ecf661d
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13127e5a400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+45a34334c61a8ecf661d@syzkaller.appspotmail.com

IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
8021q: adding VLAN 0 to HW filter on device team0
==================================================================
BUG: KASAN: stack-out-of-bounds in schedule_debug kernel/sched/core.c:3285 [inline]
BUG: KASAN: stack-out-of-bounds in __schedule+0x1977/0x1df0 kernel/sched/core.c:3395
Read of size 8 at addr ffff8801ad090000 by task syz-executor0/4718

Weird, can you please help me decipher this? So here KASAN complains about
wrong memory access in the scheduler.

This looks like the result of an earlier silent memory corruption.

The KASAN report says there is a stack out-of-bounds read in the scheduler,
and that is followed by a slab corruption report in another task.

fs/jbd2/transaction.c happens to be the first meaningful file in this
crash, and so that's where it is attributed to.

Rerunning the reproducer several times may give some better clues,
or maybe not; the reports may all look equally puzzling.

This part of the repro looks familiar:

r1 = bpf$MAP_CREATE(0x0, &(0x7f0000002e40)={0x12, 0x0, 0x4, 0x6e, 0x0, 0x1}, 0x68)
bpf$MAP_UPDATE_ELEM(0x2, &(0x7f0000000180)={r1, &(0x7f0000000000), &(0x7f0000000140)}, 0x20)

We had exactly such consequences of a bug in bpf map very recently,
but that was claimed to be fixed. Maybe not completely?
+bpf maintainers

Looks like syzbot found this in Linus' tree at HEAD commit 5b394b2ddf03 ("Linux 4.19-rc1");
one day later the net PR got merged via 050cdc6c9501 ("Merge git://git.kernel.org/pub/...").

This PR contained a couple of fixes I did on sockmap code during audit such as:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b845c898b2f1ea458d5453f0fa1da6e2dfce3bb4

The reproducer that syzkaller found contains:

  r1 = bpf$MAP_CREATE(0x0, &(0x7f0000002e40)={0x12, 0x0, 0x4, 0x6e, 0x0, 0x1}, 0x68)
                                                    ^^^

So it found the crash with map type of sock hash and key size of 0x0 (which is invalid),
where subsequent map update triggered the corruption. I just did a 'syz test' and it
wasn't able to trigger the crash anymore.
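
For anyone reading the repro line by hand, here is a rough sketch of how the six values in the MAP_CREATE argument can be decoded. It assumes they map, in order, onto the leading BPF_MAP_CREATE members of union bpf_attr (map_type, key_size, value_size, max_entries, map_flags, inner_map_fd). The helper names are made up for illustration, and this is not the kernel's actual check.

```python
# Assumed field order: the leading BPF_MAP_CREATE members of union bpf_attr.
MAP_CREATE_FIELDS = (
    "map_type",
    "key_size",
    "value_size",
    "max_entries",
    "map_flags",
    "inner_map_fd",
)

# Value of BPF_MAP_TYPE_SOCKHASH in enum bpf_map_type (18 decimal).
BPF_MAP_TYPE_SOCKHASH = 0x12


def decode_map_create(values):
    """Pair the positional values from the syz repro with field names.

    Hypothetical helper, for reading the repro only.
    """
    if len(values) != len(MAP_CREATE_FIELDS):
        raise ValueError("expected %d values" % len(MAP_CREATE_FIELDS))
    return dict(zip(MAP_CREATE_FIELDS, values))


def has_zero_sized_key(attr):
    """True when the map is requested with key_size == 0."""
    return attr["key_size"] == 0


# Values copied from the bpf$MAP_CREATE call in the reproducer.
attr = decode_map_create([0x12, 0x0, 0x4, 0x6E, 0x0, 0x1])

for name in MAP_CREATE_FIELDS:
    print("%-12s = %#x" % (name, attr[name]))
print("sock hash map:  ", attr["map_type"] == BPF_MAP_TYPE_SOCKHASH)
print("zero-sized key: ", has_zero_sized_key(attr))
```

Under that assumption the second value is key_size, which is the field marked with the carets above.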

#syz fix: bpf, sockmap: fix sock_hash_alloc and reject zero-sized keys


This crash looks related:
https://groups.google.com/d/msg/syzkaller-bugs/luviyHUQ9N4/dmgK2OmLBAAJ
