Thread: 72 messages, 11 authors, 2020-11-02

Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

From: Arnd Bergmann <arnd@arndb.de>
Date: 2020-10-22 09:14:59
Also in: linux-mm

On Thu, Oct 22, 2020 at 10:26 AM Greg KH [off-list ref] wrote:
On Thu, Oct 22, 2020 at 12:39:14AM +0100, Al Viro wrote:
On Wed, Oct 21, 2020 at 06:13:01PM +0200, Greg KH wrote:
On Fri, Sep 25, 2020 at 06:51:39AM +0200, Christoph Hellwig wrote:

I can't really figure out what the regression is, only that this commit
keeps a "large Android system binary" from working properly.  There's
no kernel log messages anywhere, and I don't have any way to strace the
thing in the testing framework, so any hints that people can provide
would be most appreciated.
It's a pure move - modulo changed line breaks in the argument lists,
the functions involved are identical before and after that (just checked
that directly, by checking out the trees before and after, extracting the two
functions in question from fs/read_write.c and lib/iov_iter.c (before and
after, resp.) and checking the diff between those).

How certain is your bisection?
The bisection is very reproducible.

But this now looks to be a compiler bug.  I'm using the latest version
of clang and if I put "noinline" at the front of the function,
everything works.

Nick, any ideas here as to who I should report this to?

I'll work on a fixup patch for the Android kernel tree to see if I can
work around it there, but others will hit this in Linus's tree sooner or
later...
I see that Christoph rewrote the function again in bfdc59701d6d
("iov_iter: refactor rw_copy_check_uvector and import_iovec").
Do you know if the current mainline version is also affected?

Do you know if it happens across multiple architectures or might
be specific to either x86 or arm64?

https://bugs.llvm.org/ is obviously the place for reporting the
issue if it does turn out to be a bug in clang, but that requires
a specific thing going wrong in the output.

One idea I have for debugging it is to sprinkle the inlined
function with lots of barrier()s to prevent a lot of the optimizations.
If that solves the issue, you can bisect through those until you
find one barrier that makes the difference between working and
broken and then look at the diff of the assembler output.
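
A minimal user-space sketch of the barrier idea, for illustration only.
The struct, the function name check_segments() and its body are
hypothetical stand-ins, not the kernel's rw_copy_check_uvector(); the
barrier() macro here is a compiler-only barrier written out by hand on
the assumption of a GCC- or clang-compatible compiler.

```c
#include <stddef.h>

/* Compiler-only barrier: emits no instructions, but the "memory"
 * clobber stops the compiler from moving or merging memory accesses
 * across this point. Defined by hand here for user space. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical segment descriptor, for illustration only. */
struct seg {
	size_t len;
};

/* Hypothetical stand-in for the inlined function under suspicion.
 * It sums segment lengths, clamping the running total to max.
 * Each barrier() is one candidate to remove while bisecting. */
static inline size_t check_segments(const struct seg *v, size_t n, size_t max)
{
	size_t total = 0;
	size_t i;

	barrier();			/* candidate 1 */
	for (i = 0; i < n; i++) {
		size_t len = v[i].len;

		barrier();		/* candidate 2 */
		if (len > max - total)
			len = max - total;
		barrier();		/* candidate 3 */
		total += len;
	}
	barrier();			/* candidate 4 */
	return total;
}
```

Starting from a build where all barriers are present and the problem is
gone, drop them one at a time (or half at a time) until the failure
returns. Then compile the last working and first broken variants with
-S and diff the two assembler files.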

        Arnd