gcc 4.6.3 miscompile on ppc32 (was Re: Regression in kernel 4.12-rc1 for... | linuxppc-dev

gcc 4.6.3 miscompile on ppc32 (was Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3)

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: 2017-06-25 20:53:30
Also in: lkml

On Sun, Jun 25, 2017 at 12:14:04PM +0100, Al Viro wrote:

On Sun, Jun 25, 2017 at 10:53:58AM +0100, Al Viro wrote:

quoted

On Sat, Jun 24, 2017 at 12:29:23PM -0500, Larry Finger wrote:

quoted

I made a break through. If I turn off inline copy to/from users for 32-bit
ppc with the following patch, then the system boots:

OK...  So it's 4.6.3 miscompiling something - it is hardware-independent,
reproduced in qemu.  I'd like to get more self-contained example of
miscompile, though; should be done by tonight...

OK, it's the call in rw_copy_check_uvector(); with INLINE_COPY_FROM_USER
it's miscompiled by 4.6.3.  I hadn't looked through the generated code
yet; will do that after I grab some sleep.

Confirmed.  It manages to bugger the loop immediately after the (successful)
copying of iovec array in rw_copy_check_uvector(); both with and without
INLINE_COPY_FROM_USER it has (just before the call of copy_from_user()) r27
set to nr_segs * sizeof(struct iovec).  The call is made, we check that it
has succeeded and that's when it hits the fan: without INLINE_COPY_FROM_USER
we have (interleaved with unrelated insns)
        addi 27,27,-8
        srwi 27,27,3
        addi 27,27,1
        mtctr 27
Weird, but manages to pass nr_segs to mtctr.  _With_ INLINE_COPY_FROM_USER we
get this:
        lis 9,0x2000
        mtctr 9
In other words, the loop will try to go through 8192 iterations.  No idea where
that number has come from, but it sure as hell is wrong.  That's where those
-EINVAL, etc. are coming from - we run into something negative in iov[seg].len,
after having run out of on-stack iovec array.

	Assembler generated out of rw_copy_check_uvector() with and without
INLINE_COPY_FROM_USER is attached; it's a definite miscompile.  Neither 4.4.5
nor 6.3.0 use mtctr/bdnz for that loop.

	The bottom line is, ppc cross-toolchain on kernel.org happens to be
the version that miscompiles rw_copy_check_uvector() with INLINE_COPY_FROM_USER
and hell knows what else.  Said that, I would rather have ppc32 drop the
INLINE_COPY_{TO,FROM}_USER anyway; that won't fix any other places where
the same 4.6.3 bug hits, but I seriously suspect that it will end up being
faster even on non^Wless buggy gcc versions.  Could powerpc folks check
what does removing those two defines from arch/powerpc/include/asm/uaccess.h
do to performance?  If there's no slowdown, I would strongly recommend just
removing those as in the patch Larry has posted upthread.

	Fixing whatever it is in gcc 4.6.3 that triggers that behaviour is
IMO pointless - it might make sense to switch kernel.org cross-toolchain to
something more recent, but that's it.

Attachments

inlined.s [text/plain] 3342 bytes · preview
uninlined.s [text/plain] 2872 bytes · preview

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help