Re: [PATCH RFC v3 08/10] net, pidfs, coredump: only allow coredumping tasks to connect to coredump socket
From: Jann Horn <jannh@google.com>
Date: 2025-05-06 14:38:28
Also in: linux-fsdevel, lkml
On Mon, May 5, 2025 at 1:14 PM Christian Brauner wrote:
Make sure that only tasks that actually coredumped may connect to the coredump socket. This restriction may be loosened later in case userspace processes would like to use it to generate their own coredumps. Though it'd be wiser if userspace just exposed a separate socket for that.

On Mon, May 05, 2025 at 03:08:07PM +0200, Jann Horn wrote:
This implementation kinda feels a bit fragile to me... I wonder if we could instead have a flag inside the af_unix client socket that says "this is a special client socket for coredumping".

On Mon, 5 May 2025 16:06:40 +0200, Christian Brauner <brauner@kernel.org> wrote:
Should be easily doable with a sock_flag().

On Mon, May 5, 2025 at 8:41 PM Kuniyuki Iwashima wrote:
This restriction should be applied by BPF LSM.

On Mon, May 05, 2025 at 09:10:28PM +0200, Jann Horn wrote:
I think we shouldn't allow random userspace processes to connect to the core dump handling service and provide bogus inputs; that unnecessarily increases the risk that a crafted coredump can be used to exploit a bug in the service. So I think it makes sense to enforce this restriction in the kernel.

My understanding is that BPF LSM creates fairly tight coupling between userspace and the kernel implementation, and it is kind of unwieldy for userspace. (I imagine the "man 5 core" manpage would get a bit longer and describe more kernel implementation detail if you tried to show how to write a BPF LSM that is capable of detecting unix domain socket connections to a specific address that are not initiated by core dumping.) I would like to keep it possible to implement core userspace functionality in a best-practice way without needing eBPF.

Kuniyuki Iwashima wrote:
It's hard to loosen such a default restriction as someone might argue that's unexpected and a regression.

Jann Horn wrote:
If userspace wants to allow other processes to connect to the core dumping service, that's easy to implement - userspace can listen on a separate address that is not subject to these restrictions.

On Tue, May 6, 2025 at 10:06 AM Christian Brauner wrote:
I think Kuniyuki's point is defensible. And I did discuss this with Lennart when I wrote the patch and he didn't see a point in preventing other processes from connecting to the core dump socket. He actually would like this to be possible because there are some userspace programs out there that generate their own coredumps (Python?) and he wanted them to use the general coredump socket to send them to.

I just found it more elegant to simply guarantee that the only connections made to that socket come from coredumping tasks.

But I should note there are two ways to cleanly handle this in userspace. I had already mentioned the bpf LSM in the context of rate-limiting in an earlier posting:

(1) complex:
Use a bpf LSM to intercept the connection request via security_unix_stream_connect() in unix_stream_connect(). The bpf program can simply check:
current->signal->core_state
and reject any connection if it is NULL.

Jann Horn replied:
I think that would be racy, since zap_threads sets that pointer before ensuring that the other threads under the signal_struct are killed.

Christian Brauner continued:
The big downside is that bpf (and security) need to be enabled.
Neither is guaranteed and there's quite a few users out there that
don't enable bpf.
(2) simple (and supported in this series):
Userspace accepts a connection. It has to get SO_PEERPIDFD anyway.
It then needs to verify:

    struct pidfd_info info = {
            .mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP,
    };

    ioctl(pidfd, PIDFD_GET_INFO, &info);
    if (!(info.mask & PIDFD_INFO_COREDUMP)) {
            // Can't be from a coredumping task so we can close the
            // connection without reading.
            close(coredump_client_fd);
            return;
    }

    /* This has to be set and is only settable by do_coredump(). */
    if (!(info.coredump_mask & PIDFD_COREDUMPED)) {
            // Can't be from a coredumping task so we can close the
            // connection without reading.
            close(coredump_client_fd);
            return;
    }

    // Ok, this is a connection from a task that has coredumped, let's
    // handle it.

The crux is that the series guarantees that, by the time the
connection is made, the information on whether the task/thread-group
coredumped is available via the pidfd.
I think if we document that most coredump servers have to do (2) then
this is fine. But I wouldn't mind a nod from Jann on this.

Jann Horn replied:
I wouldn't recommend either of these as a way to verify that the data coming over the socket is a core dump generated by the kernel, since they both look racy in that regard. But given that you're saying the initial userspace user wouldn't actually want such a restriction, and that we could later provide a separate way for userspace to check what initiated the connection, I guess this is fine for now.
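
For readers following option (2), the two userspace steps around the PIDFD_GET_INFO ioctl could be sketched roughly as below. This is a minimal sketch under stated assumptions, not code from the series: peer_pidfd() and peer_has_coredumped() are hypothetical helper names, the SKETCH_* bit values are placeholders standing in for PIDFD_INFO_COREDUMP and PIDFD_COREDUMPED (the real values come from the uapi header of a kernel carrying this series), and the SO_PEERPIDFD fallback assumes the asm-generic socket option numbering. The caller is assumed to run ioctl(pidfd, PIDFD_GET_INFO, &info) between the two helpers and pass in info.mask and info.coredump_mask.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

/* Fallback for older libc headers. Assumption: asm-generic numbering;
 * some architectures use a different value. */
#ifndef SO_PEERPIDFD
#define SO_PEERPIDFD 77
#endif

/* Placeholder bits for illustration only. A real server would use
 * PIDFD_INFO_COREDUMP and PIDFD_COREDUMPED from <linux/pidfd.h>. */
#define SKETCH_PIDFD_INFO_COREDUMP (1ULL << 4)
#define SKETCH_PIDFD_COREDUMPED    (1U << 0)

/* Hypothetical helper: fetch a pidfd for the peer of an accepted
 * AF_UNIX connection. Returns the pidfd, or -1 on failure. */
static int peer_pidfd(int client_fd)
{
	int pidfd = -1;
	socklen_t len = sizeof(pidfd);

	if (getsockopt(client_fd, SOL_SOCKET, SO_PEERPIDFD, &pidfd, &len) < 0)
		return -1;
	return pidfd;
}

/* Hypothetical helper: decide from the fields filled in by
 * PIDFD_GET_INFO whether the peer has coredumped. */
static bool peer_has_coredumped(uint64_t mask, uint32_t coredump_mask)
{
	/* The kernel did not report coredump info at all. */
	if (!(mask & SKETCH_PIDFD_INFO_COREDUMP))
		return false;
	/* Coredump info is present; require the "coredumped" bit. */
	return (coredump_mask & SKETCH_PIDFD_COREDUMPED) != 0;
}
```

A server would close the client socket without reading whenever peer_pidfd() fails or peer_has_coredumped() returns false. As discussed above, a positive result only says that the peer has coredumped; it does not prove that the kernel initiated the connection.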