Re: BUG: unable to handle kernel paging request from pty_write [was: Linux 4.4.2]
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: 2016-02-26 00:38:16
Also in:
lkml
On Thu, Feb 25, 2016 at 1:32 PM, Jiri Slaby [off-list ref] wrote:
> Interestingly, RBP contains address inside try_to_wake_up --
> ffffffff810a535a (dunno why) which is:
>   ffffffff810a5355: e8 66 a0 ff ff    callq ffffffff8109f3c0 <ttwu_stat>
>   ffffffff810a535a: e9 9d fe ff ff    jmpq ffffffff810a51fc <try_to_wake_up+0x3c>
>
> ttwu_stat does in the beginning:
>   mov $0x16e80,%r14
> which is what we actually still have in r14 when it crashes. The first
> ttwu_stat's "if" has to go through the true branch (otherwise r14 would
> be overwritten).
Hmm. That does sound very much like it might be ttwu_stat() that has
gotten the stack frame wrong, and when it exits, it does

        popq %rbp
        ret

but the popq in fact popped the return address into %rbp, and the ret
then returned to a crazy address.

Which sounds like a corrupted stack pointer (not a corrupted stack).
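
To make that failure mode concrete, here is a toy model of the stack around the call, as a rough sketch only. The call and return addresses are taken from the quoted disassembly; the stack location, the caller's %rbp value, the junk word above the return address, and the 8-byte skew are all made-up assumptions for illustration, not values from the actual crash.

```python
# Toy model of the x86-64 stack around "callq ttwu_stat".
# CALL_ADDR comes from the quoted disassembly; every other value is hypothetical.

CALL_ADDR = 0xFFFFFFFF810A5355    # callq ttwu_stat (a 5-byte instruction)
RET_ADDR = CALL_ADDR + 5          # address pushed by the call
CALLER_RBP = 0xFFFF880000001000   # hypothetical frame pointer of the caller
JUNK = 0xDEADBEEFDEADBEEF         # hypothetical word just above the return address


def run(rsp_skew):
    """Simulate call / pushq %rbp / movq %rsp,%rbp / popq %rbp / ret.

    rsp_skew is added to the stack pointer just before the epilogue to
    model a corrupted %rsp. Returns (rbp, rip) as they are after the ret.
    """
    mem = {}
    rsp = 0x1000                  # hypothetical stack location
    rbp = CALLER_RBP
    mem[rsp] = JUNK               # whatever the caller had on top of its stack

    rsp -= 8                      # callq: push the return address
    mem[rsp] = RET_ADDR
    rsp -= 8                      # pushq %rbp
    mem[rsp] = rbp
    rbp = rsp                     # movq %rsp, %rbp

    rsp += rsp_skew               # something bumps %rsp behind the function's back

    rbp = mem[rsp]                # popq %rbp
    rsp += 8
    rip = mem[rsp]                # ret
    rsp += 8
    return rbp, rip


good = run(0)   # intact %rsp: caller's %rbp restored, return to RET_ADDR
bad = run(8)    # %rsp off by one slot: %rbp gets RET_ADDR, ret jumps to JUNK
```

With a skew of one 8-byte slot, %rbp ends up holding the return address inside try_to_wake_up, which matches what the oops shows, while the stack contents themselves are never modified.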
Can you make just the "vmlinux" file available somewhere?
In my own private configuration, ttwu_stat() doesn't actually touch
the stack at all - no stack pointer action anywhere except for the

    ttwu_stat:
    1:      call    __fentry__
            pushq   %rbp
            ..
            movq    %rsp, %rbp      #,
            .....
            popq    %rbp
            ret

but yeah, as Peter says, maybe an exception screwed up %rsp somehow..
I really don't see how it would happen here - that code doesn't look
particularly odd.
And the fentry code used by the function tracer can certainly screw
things up, but even that would be hard-pressed to screw up %rbp, since
the saving of rbp comes *after* fentry. Old pre-__fentry__ gcc
versions had a much higher likelihood (the whole mcount thing is a
disaster, but I'm assuming you have a compiler that does __fentry__
and have CC_USING_FENTRY set?)
Linus