Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue
From: Siddh Raman Pant <hidden>
Date: 2022-08-03 08:40:55
Also in:
linux-kernel-mentees, lkml
On Wed, 03 Aug 2022 11:11:31 +0530 Eric Biggers [off-list ref] wrote:
I tested the syzbot reproducer https://syzkaller.appspot.com/text?tag=ReproC&x=174ea97e080000, and it does *not* trigger the bug on the latest upstream. But, it does trigger the bug if I recent Linus's recent watch_queue fixes. So I don't currently see any evidence of an unfixed bug. If, nevertheless, you believe that a bug is still unfixed, then please provide a reproducer and a fix patch that clearly explains what it is fixing.quoted
There is a null check in post_one_notification for the pipe, most probably because it *expects* the pointer to be NULL'd. Also, there is no reason to have a dangling pointer stay, it's just a recipe for further bugs.If you want to send a patch or patches to clean up the code, that is fine, but please make it super clear what is a cleanup and what is a fix. - Eric
I honestly feel like I am repeating myself yet again, but okay. Of course, the race condition has been solved by a patch upstream, which I had myself mentioned earlier. But what I am saying is that it did *not* address *what* that race condition had triggered, i.e. the visible cause of the UAF crash, which, among other things, is *because* there is a dangling pointer to the freed pipe, which *caused* the crash in post_one_notification() when it tried to access &pipe->rd.wait_lock as an argument to spin_lock_irq(), a path it reached after checking if wqueue->pipe is NULL and proceeded when it was not the case. And the upstream commit was made *after* I had posted this patch, hence this was a fix for the syzkaller issue. While I am *not* saying to accept it just because this was posted earlier, I am saying this patch addresses a parallel issue, i.e. the *actual use-after-free crash* which was reproduced by those reproducers, i.e., what was attempted to be used after getting freed and detected by KASAN. We don't need to wait for another similar syzbot report to pop up before doing this change, and say let's not fix a dangling pointer reference because now another commit apparately fixes the specific syzkaller issue, causing the given specific reproducer with its specific way of reproducing to fail, when we in fact now know it *can* be a valid problem in practice and doing this change too causes the specific reproducer under consideration to fail reproducing, as was reported by the reproducer itself. I really don't know how to create stress tests / reproducers like how syzkaller makes, so if a similar new reproducer is really required for showing this patch's validity disregarding any earlier reproducers, I unfortunately cannot make it due to skill issue as I just started in kernel dev, and I am deeply sorry for wasting the time of everyone, and I am thankful for your criticism of my patch. Thanks, Siddh