RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight <hidden>
Date: 2020-10-22 09:28:28
Also in:
io-uring, keyrings, linux-arch, linux-arm-kernel, linux-block, linux-fsdevel, linux-mips, linux-mm, linux-s390, linux-scsi, linux-security-module, linuxppc-dev, lkml, sparclinux
From: David Hildenbrand
Sent: 22 October 2020 10:19 On 22.10.20 11:01, Greg KH wrote:quoted
On Thu, Oct 22, 2020 at 10:48:59AM +0200, David Hildenbrand wrote:quoted
On 22.10.20 10:40, David Laight wrote:quoted
From: David Hildenbrandquoted
Sent: 22 October 2020 09:35 On 22.10.20 10:26, Greg KH wrote:quoted
On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:quoted
On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:quoted
On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:quoted
From: David Laight <redacted> This lets the compiler inline it into import_iovec() generating much better code. Signed-off-by: David Laight <redacted> Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/read_write.c | 179 ------------------------------------------------ lib/iov_iter.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 176 insertions(+), 179 deletions(-)Strangely, this commit causes a regression in Linus's tree right now. I can't really figure out what the regression is, only that this commit triggers a "large Android system binary" from working properly. There's no kernel log messages anywhere, and I don't have any way to strace the thing in the testing framework, so any hints that people can provide would be most appreciated.It's a pure move - modulo changed line breaks in the argument lists the functions involved are identical before and after that (just checked that directly, by checking out the trees before and after, extracting two functions in question from fs/read_write.c and lib/iov_iter.c (before and after, resp.) and checking the diff between those. How certain is your bisection?The bisection is very reproducable. But, this looks now to be a compiler bug. I'm using the latest version of clang and if I put "noinline" at the front of the function, everything works.Well, the compiler can do more invasive optimizations when inlining. If you have buggy code that relies on some unspecified behavior, inlining can change the behavior ... but going over that code, there isn't too much action going on. At least nothing screamed at me.Apart from all the optimisations that get rid off the 'pass be reference' parameters and strange conditional tests. Plenty of scope for the compiler getting it wrong. But nothing even vaguely illegal.Not the first time that people blame the compiler to then figure out that something else is wrong ... but maybe this time is different :)I agree, I hate to blame the compiler, that's almost never the real problem, but this one sure "feels" like it. I'm running some more tests, trying to narrow things down as just adding a "noinline" to the function that got moved here doesn't work on Linus's tree at the moment because the function was split into multiple functions. Give me a few hours...I might be wrong but a) import_iovec() uses: - unsigned nr_segs -> int - unsigned fast_segs -> int b) rw_copy_check_uvector() uses: - unsigned long nr_segs -> long - unsigned long fast_seg -> long So when calling rw_copy_check_uvector(), we have to zero-extend the registers used for passing the arguments. That's definitely done when calling the function explicitly. Maybe when inlining something is messed up?
That's also not needed on x86-64 - the high bits get cleared by 32bit writes. But, IIRC, arm64 leaves them unchanged or undefined. I guessing that every array access uses a *(Rx + Ry) addressing mode. So indexing an array even with 'unsigned int' requires an explicit zero-extend on arm64? (x86-64 ends up with an explicit sign extend when indexing an array with 'signed int'.) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)