Thread: 4 messages, 2 authors, 2017-11-24

v4.14-rc{4,7} null pointer dereference in event_sched_out()

From: mark.rutland@arm.com (Mark Rutland)
Date: 2017-11-24 18:16:34
Also in: lkml

On Fri, Nov 24, 2017 at 06:10:56PM +0000, Mark Rutland wrote:
> On Wed, Nov 15, 2017 at 06:00:20PM +0000, Will Deacon wrote:
> > On Mon, Oct 30, 2017 at 04:23:15PM +0000, Mark Rutland wrote:
> > > As a heads-up, while fuzzing arm64 v4.14-rc{4,7} with Syzkaller, I hit a
> > > KASAN splat in event_sched_out():
> >
> > Did you get anywhere with this?
>
> I got a *bit* further, but I haven't figured out the underlying issue
> yet.

Forgot to mention, the above all applies to a vanilla v4.14 arm64
kernel; defconfig + KASAN_INLINE.

Thanks,
Mark.

I minimized the reproducer down to the following:

----
# {Threaded:true Collide:true Repeat:true Procs:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false}

r2 = gettid()
mmap(&(0x7f0000000000/0xd3f000)=nil, 0xd3f000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
r0 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x9, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, 0xffffffffffffffff, 0x0)
mmap(&(0x7f0000d3f000/0x1000)=nil, 0x1000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
r1 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, r0, 0x0)
dup3(0, 0, 0)
perf_event_open(&(0x7f0000b13000-0x78)={0x0, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, r0, 0x0)
----

Note: the dup3() is an expensive NOP (since oldfd == newfd), but I think
it's triggering an interesting scheduling pattern, since thus far I
haven't managed to trigger the bug without it.

That creates a perf_cpu_clock event, adds another to that group, and
adds a HW event to that same group. In parallel.
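
For readers not used to Syzkaller syntax, the three perf_event_open()
calls can be sketched in plain C roughly as below. This is only an
illustration under assumptions: it takes the first three attr fields of
each call as type, size and config (read here as PERF_TYPE_SOFTWARE /
PERF_COUNT_SW_CPU_CLOCK twice, then PERF_TYPE_HARDWARE /
PERF_COUNT_HW_CPU_CYCLES), leaves every other attr field zero, and uses
sizeof(attr) rather than the reproducer's 0x78. The helper names are
made up for this sketch. It issues the calls one after another from a
single thread, so it is not expected to hit the race.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: build an attr from type and config only.
 * All other fields stay zero, which is a simplification of the
 * reproducer's attr contents.
 */
struct perf_event_attr make_attr(unsigned int type, unsigned long long config)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = type;
	attr.size = sizeof(attr);	/* the reproducer passes 0x78 here */
	attr.config = config;
	return attr;
}

/* Thin wrapper: glibc has no perf_event_open() function. cpu is -1, flags 0. */
int perf_open(struct perf_event_attr *attr, pid_t pid, int group_fd)
{
	return (int)syscall(SYS_perf_event_open, attr, pid, -1, group_fd, 0UL);
}

/*
 * Open a SW leader, a SW sibling and a HW sibling on the calling thread,
 * then close them again. Returns 0 if all three opens succeeded, -1
 * otherwise (for example when perf is unavailable or not permitted).
 */
int open_group_sequentially(void)
{
	pid_t tid = (pid_t)syscall(SYS_gettid);
	struct perf_event_attr sw = make_attr(PERF_TYPE_SOFTWARE,
					      PERF_COUNT_SW_CPU_CLOCK);
	struct perf_event_attr hw = make_attr(PERF_TYPE_HARDWARE,
					      PERF_COUNT_HW_CPU_CYCLES);
	int leader, sibling, hw_fd;

	leader = perf_open(&sw, tid, -1);		/* r0: group leader */
	if (leader < 0)
		return -1;

	sibling = perf_open(&sw, tid, leader);		/* r1: SW follower */
	hw_fd = perf_open(&hw, tid, leader);		/* HW event, same group */

	if (hw_fd >= 0)
		close(hw_fd);
	if (sibling >= 0)
		close(sibling);
	close(leader);

	return (sibling >= 0 && hw_fd >= 0) ? 0 : -1;
}
```

Calling open_group_sequentially() from several threads at once would be
closer to what Syzkaller does with Threaded/Collide set, but the
multi-threaded C tests mentioned in this mail did not trigger the bug
either.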

Sometimes at the point the HW event is added, the leading SW event is in
PERF_EVENT_STATE_INACTIVE, but the follower SW event is in
PERF_EVENT_STATE_ACTIVE. The context both are held in is inactive, so
the follower event's state makes no sense.

I added a dump to event_sched_out() that catches this:

[   35.995144] Uh-oh:
[   35.995144]   event ffff800039a1f880
[   35.995144]   event->state 1
[   35.995144]   event->cpu -1
[   35.995144]   pmu ffff20000a3b2600 (perf_cpu_clock, AKA (null))
[   35.995144]   leader ffff800039a1a480
[   35.995144]   leader->state -1
[   35.995144]   pmu ffff20000a3b2600 (perf_cpu_clock, AKA (null))
[   35.995144]   ctx ffff80003932e180, pmu ffff20000a3b2600 (perf_cpu_clock AKA (null))
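
To make the raw state numbers in that dump easier to read, here is a
small user-space sketch that names them and encodes the "an ACTIVE event
needs an active context" expectation described above. The enum values
are assumed to mirror enum perf_event_active_state in v4.14's
include/linux/perf_event.h, so check them against your tree. The
model_* names are invented for this sketch and are not kernel APIs.

```c
#include <stdbool.h>

/*
 * Assumption: these values mirror enum perf_event_active_state
 * in v4.14 (PERF_EVENT_STATE_DEAD .. PERF_EVENT_STATE_ACTIVE).
 */
enum model_event_state {
	MODEL_STATE_DEAD	= -4,
	MODEL_STATE_EXIT	= -3,
	MODEL_STATE_ERROR	= -2,
	MODEL_STATE_OFF		= -1,
	MODEL_STATE_INACTIVE	=  0,
	MODEL_STATE_ACTIVE	=  1,
};

/* Map a raw state value, as printed in the dump, to a short name. */
const char *model_state_name(int state)
{
	switch (state) {
	case MODEL_STATE_DEAD:		return "DEAD";
	case MODEL_STATE_EXIT:		return "EXIT";
	case MODEL_STATE_ERROR:		return "ERROR";
	case MODEL_STATE_OFF:		return "OFF";
	case MODEL_STATE_INACTIVE:	return "INACTIVE";
	case MODEL_STATE_ACTIVE:	return "ACTIVE";
	default:			return "UNKNOWN";
	}
}

/*
 * Hypothetical consistency check: an event may only be ACTIVE while
 * the context holding it is active. Every other state is accepted
 * regardless of the context.
 */
bool model_state_is_plausible(int event_state, bool ctx_is_active)
{
	return event_state != MODEL_STATE_ACTIVE || ctx_is_active;
}
```

Under these assumed values, the dump's event->state 1 decodes as ACTIVE
and leader->state -1 decodes as OFF. model_state_is_plausible(1, false)
returns false, which is the combination the added dump was written to
catch.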

I'll try to dig into this a bit more next week.

I can't reproduce this with Syzkaller running in a single thread, nor
with some multi-threaded tests I wrote in C, so I guess there's a subtle
race I'm not managing to hit.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel at lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel