Re: [PATCH] Introduce the pkill_on_warn boot parameter

From: Alexander Popov <hidden>
Date: 2021-10-06 14:57:05
Also in: linux-hardening, lkml

On 05.10.2021 22:48, Eric W. Biederman wrote:

Alexander Popov [off-list ref] writes:

On 02.10.2021 19:52, Linus Torvalds wrote:

On Sat, Oct 2, 2021 at 4:41 AM Alexander Popov [off-list ref] wrote:

And what do you think about the proposed pkill_on_warn?

Honestly, I don't see the point.

If you can reliably trigger the WARN_ON some way, you can probably
cause more problems by fooling some other process to trigger it.

And if it's unintentional, then what does the signal help?

So rather than a "rationale" that makes little sense, I'd like to hear
of an actual _use_ case. That's different. That's somebody actually
_using_ that pkill to good effect for some particular load.

I was thinking about a use case for you and got an insight.

Bugs usually don't come alone. Killing the process that got WARN_ON() prevents
possible bad effects **after** the warning. For example, in my exploit for
CVE-2019-18683, the kernel warning happens **before** the memory corruption
(use-after-free in the V4L2 subsystem).
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html

So pkill_on_warn allows the kernel to stop the process when the first signs of
wrong behavior are detected. In other words, proceeding with the code execution
from the wrong state can bring more disasters later.

That said, I don't much care in the end. But it sounds like a
pointless option to just introduce yet another behavior to something
that should never happen anyway, and where the actual
honest-to-goodness reason for WARN_ON() existing is already being
fulfilled (ie syzbot has been very effective at flushing things like
that out).

Yes, we are slowly getting rid of kernel warnings.
However, the syzbot dashboard still shows a lot of them.
Even my small syzkaller setup finds plenty of new warnings.
I believe fixing all of them will take some time.
And during that time, pkill_on_warn may be a better reaction to WARN_ON() than
ignoring it and proceeding with execution.

Is that reasonable?

I won't comment on the sanity of the feature but I will say that calling
it oops_on_warn (rather than pkill_on_warn), and using the usual oops
facilities rather than rolling oops by hand sounds like a better
implementation.

Especially as calling do_group_exit(SIGKILL) from a random location is
not a clean way to kill a process.  Strictly speaking it is not even
killing the process.

Partly this is just me seeing the introduction of a
do_group_exit(SIGKILL) call and not liking the maintenance that will be
needed.  I am still sorting out the problems with other randomly placed
calls to do_group_exit(SIGKILL) and interactions with ptrace and
PTRACE_EVENT_EXIT in particular.

Which is a long-winded way of saying if I can predictably trigger a
warning that calls do_group_exit(SIGKILL), on some architectures I can
use ptrace and can convert that warning into a way to manipulate the
kernel stack to have the contents of my choice.

If anyone goes forward with this please use the existing oops
infrastructure so the ptrace interactions and anything else that comes
up only needs to be fixed once.

Eric, thanks a lot.

I will study the oops infrastructure in more depth.
I will do more experiments and come back with version 2.

Currently, I think I will keep the pkill_on_warn option name because I want to
avoid kernel crashes.

Thanks to everyone who gave feedback on this patch!

Best regards,
Alexander
