Re: [PATCH] powerpc/tm: Fix userspace r13 corruption

From: Breno Leitao <leitao@debian.org>
Date: 2018-09-25 14:32:40

Hi Mikey,

On 09/25/2018 02:24 AM, Michael Neuling wrote:

On Mon, 2018-09-24 at 11:32 -0300, Breno Leitao wrote:

quoted

Hi Mikey,

On 09/24/2018 04:27 AM, Michael Neuling wrote:

quoted

When we treclaim we store the userspace checkpointed r13 to a scratch
SPR and then later save the scratch SPR to the user thread struct.

Unfortunately, this doesn't work as accessing the user thread struct
can take an SLB fault and the SLB fault handler will write the same
scratch SPRG that now contains the userspace r13.

To fix this, we store r13 to the kernel stack (which can't fault)
before we access the user thread struct.

Found by running P8 guest + powervm + disable_1tb_segments + TM. Seen
as a random userspace segfault with r13 looking like a kernel address.

Signed-off-by: Michael Neuling <redacted>

Reviewed-by: Breno Leitao <leitao@debian.org>

I am wondering if the same problem could not happen with r1 as well, since r1
is kept in the paca->tm_scratch after MSR[RI] is enabled, in a code similar
to this:

        std     r1, PACATMSCRATCH(r13)
	..
        li      r11, MSR_RI
	mtmsrd  r11, 1
	....
        ld      r3, PACATMSCRATCH(r13)          /* user r1 */
        std     r3, GPR1(r7)

There might be a corruption if the exception that is raised just after
MSR[RI] is enabled somehow exits through 'fast_exception_return' (being
called by most of the exception handlers as I understand) which will
overwrite paca->tm_scratch with the upcoming MSR (aka SRR1) just before the
SRR1, as:

	ld      r3,_MSR(r1)

	...
	std     r3, PACATMSCRATCH(r13) /* Stash returned-to MSR */


It seems that slb_miss_common and instruction_access_slb does not go through
this path, but handle_page_fault calls
ret_from_except_lite->fast_exception_return, which might cause this
'possible' issue.

Yeah, good catch! I think we are ok but it's delicate. 

I'll store it in the kernel stack and avoid PACATMSCRATCH before I turn RI on.  

This look ok?

Yes, That is exactly what I though when I looked at the code. Thanks for
fixing it.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help