Thread: 8 messages, 4 authors, 2018-08-23

Re: Odd SIGSEGV issue introduced by commit 6b31d5955cb29 ("mm, oom: fix potential data corruption when oom_reaper races with writer")

From: Christophe LEROY <hidden>
Date: 2018-08-22 08:19:05
Also in: linux-mm


On 21/08/2018 at 19:50, Ram Pai wrote:
On Tue, Aug 21, 2018 at 04:40:15PM +1000, Michael Ellerman wrote:
Christophe LEROY [off-list ref] writes:
...
And I bisected its disappearance with commit 99cd1302327a2 ("powerpc:
Deliver SEGV signal on pkey violation")
Whoa that's weird.
Looking at those two commits, especially the one which makes it
disappear, I'm quite sceptical. Any idea what could be the cause and/or
how to investigate further?
Are you sure it's not some corruption that just happens to be masked by
that commit? I can't see anything in that commit that could explain that
change in behaviour.

The only real change is if you're hitting DSISR_KEYFAULT isn't it?
Even with commit 99cd1302327a2, a SEGV signal should get generated,
which should kill the process, unless the process handles SEGV signals
with SEGV_PKUERR differently.
No, the SIGSEGVs are not handled differently. And the trace shows that it is
SEGV_MAPERR which is generated.
The other surprising thing is, why is DSISR_KEYFAULT getting generated
in the first place?  Are keys somehow getting programmed into the HPTE?
Can't be that, because DSISR_KEYFAULT is filtered out when applying 
DSISR_SRR1_MATCH_32S mask.
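
To illustrate the masking argument, here is a minimal sketch, not taken from the kernel source. The bit values are written from memory of arch/powerpc/include/asm/reg.h and should be treated as assumptions to verify against the tree; filter_dsisr_32s() and has_keyfault() are hypothetical helper names.

```c
#include <stdint.h>

/* Assumed bit values; check them against arch/powerpc/include/asm/reg.h. */
#define DSISR_NOHPTE       0x40000000u
#define DSISR_NOEXEC_OR_G  0x10000000u
#define DSISR_PROTFAULT    0x08000000u
#define DSISR_KEYFAULT     0x00200000u

/* Assumed composition of the 32s mask: only these three bits survive. */
#define DSISR_SRR1_MATCH_32S \
	(DSISR_NOHPTE | DSISR_NOEXEC_OR_G | DSISR_PROTFAULT)

/* Hypothetical helper: keep only the bits selected by the 32s mask. */
uint32_t filter_dsisr_32s(uint32_t raw)
{
	return raw & DSISR_SRR1_MATCH_32S;
}

/* Hypothetical helper: non-zero if the key fault bit is set. */
int has_keyfault(uint32_t dsisr)
{
	return (dsisr & DSISR_KEYFAULT) != 0;
}
```

Under those assumed values, has_keyfault(filter_dsisr_32s(x)) is 0 for any x, because the key fault bit is not part of the mask.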
Feels like some random corruption.
In a way yes, except that it is always at the same instruction (in
ld.so) and always because the accessed address is 0x67xxxxxx instead of
0x77xxxxxx.
I also tested with TASK_SIZE set to 0xa0000000 instead of 0x80000000,
and I get the same failure, with the bad address being 0x87xxxxxx instead
of 0x97xxxxxx.
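
As a sanity check on the two cases above, a small sketch of the arithmetic, not part of the original report. The helper names are hypothetical, and the unknown low bits (the "xxxxxx" part) are assumed to be identical in the expected and the faulting address, so they can be replaced by zeros.

```c
#include <stdint.h>

/* Bits that differ between the address that should have been accessed
 * and the address that actually faulted. */
uint32_t addr_diff(uint32_t expected, uint32_t faulting)
{
	return expected ^ faulting;
}

/* Non-zero when exactly one bit is set in x. */
int is_single_bit(uint32_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}
```

With the low bits zeroed, addr_diff(0x77000000, 0x67000000) and addr_diff(0x97000000, 0x87000000) both give 0x10000000. So in both TASK_SIZE configurations the faulting address differs from the expected one by the same single bit, 256 MB lower. This only restates the observation; it does not identify the cause.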

Christophe
Is this behavior seen with power8 or power9?

RP