Thread (18 messages) 18 messages, 4 authors, 2022-08-04

Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

From: Eric Biggers <ebiggers@kernel.org>
Date: 2022-08-03 18:17:42
Also in: linux-kernel-mentees, lkml

On Wed, Aug 03, 2022 at 02:10:06PM +0530, Siddh Raman Pant wrote:
On Wed, 03 Aug 2022 11:11:31 +0530  Eric Biggers [off-list ref] wrote:
quoted
I tested the syzbot reproducer
https://syzkaller.appspot.com/text?tag=ReproC&x=174ea97e080000, and it does
*not* trigger the bug on the latest upstream.  But, it does trigger the bug if I
recent Linus's recent watch_queue fixes.

So I don't currently see any evidence of an unfixed bug.  If, nevertheless, you
believe that a bug is still unfixed, then please provide a reproducer and a fix
patch that clearly explains what it is fixing. 
quoted
There is a null check in post_one_notification for the pipe, most probably
because it *expects* the pointer to be NULL'd. Also, there is no reason to have
a dangling pointer stay, it's just a recipe for further bugs.
If you want to send a patch or patches to clean up the code, that is fine, but
please make it super clear what is a cleanup and what is a fix.

- Eric
I honestly feel like I am repeating myself yet again, but okay.
Well, you should try listening instead.  Because you are not listening.
Of course, the race condition has been solved by a patch upstream, which I had
myself mentioned earlier.

But what I am saying is that it did *not* address *what* that race condition
had triggered, i.e. the visible cause of the UAF crash, which, among other
things, is *because* there is a dangling pointer to the freed pipe, which
*caused* the crash in post_one_notification() when it tried to access
&pipe->rd.wait_lock as an argument to spin_lock_irq(), a path it reached
after checking if wqueue->pipe is NULL and proceeded when it was not the case.

And the upstream commit was made *after* I had posted this patch, hence this
was a fix for the syzkaller issue. While I am *not* saying to accept it just
because this was posted earlier, I am saying this patch addresses a parallel
issue, i.e. the *actual use-after-free crash* which was reproduced by those
reproducers, i.e., what was attempted to be used after getting freed and
detected by KASAN.
Even if wqueue->pipe was set to NULL during free_pipe_info(), there would still
have been a use-after-free, as the real bug was the lack of synchronization
between post_one_notification() and free_pipe_info().  That is fixed now.
We don't need to wait for another similar syzbot report to pop up before doing
this change, and say let's not fix a dangling pointer reference because now
another commit apparately fixes the specific syzkaller issue, causing the given
specific reproducer with its specific way of reproducing to fail, when we in
fact now know it *can* be a valid problem in practice and doing this change
too causes the specific reproducer under consideration to fail reproducing, as
was reported by the reproducer itself.
To re-iterate, I encourage you to send a cleanup patch if you see an
opportunity.  It looks like the state wqueue->defunct==true could be replaced
with wqueue->pipe==NULL, which would be simpler, so how about doing that?  Just
don't claim that it is "fixing" something, unless it is, as that makes things
very confusing and difficult for everyone.
I really don't know how to create stress tests / reproducers like how syzkaller
makes, so if a similar new reproducer is really required for showing this
patch's validity disregarding any earlier reproducers, I unfortunately cannot
make it due to skill issue as I just started in kernel dev, and I am deeply
sorry for wasting the time of everyone, and I am thankful for your criticism of
my patch.
A reproducer can just be written as a normal program, in C or another language.
The syzkaller reproducers are really hard to read as they are auto-generated, so
don't read too much into them -- they're certainly not examples of good code.

- Eric
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help