Re: [PATCH v7 5/9] pidfs, coredump: add PIDFD_INFO_COREDUMP
From: Jann Horn <jannh@google.com>
Date: 2025-05-15 21:38:16
Also in:
linux-fsdevel, linux-security-module, lkml
On Thu, May 15, 2025 at 10:56 PM Jann Horn [off-list ref] wrote:
On Thu, May 15, 2025 at 12:04 AM Christian Brauner [off-list ref] wrote:quoted
Extend the PIDFD_INFO_COREDUMP ioctl() with the new PIDFD_INFO_COREDUMP mask flag. This adds the fields @coredump_mask and @coredump_cookie to struct pidfd_info.FWIW, now that you're using path-based sockets and override_creds(), one option may be to drop this patch and say "if you don't want untrusted processes to directly connect to the coredumping socket, just set the listening socket to mode 0000 or mode 0600"...
Er, forget I said that, of course we'd still want to have at least the @coredump_mask.
quoted
Signed-off-by: Christian Brauner <brauner@kernel.org>[...]quoted
diff --git a/fs/coredump.c b/fs/coredump.c index e1256ebb89c1..bfc4a32f737c 100644 --- a/fs/coredump.c +++ b/fs/coredump.c[...]quoted
@@ -876,8 +880,34 @@ void do_coredump(const kernel_siginfo_t *siginfo) goto close_fail; } + /* + * Set the thread-group leader pid which is used for the + * peer credentials during connect() below. Then + * immediately register it in pidfs... + */ + cprm.pid = task_tgid(current); + retval = pidfs_register_pid(cprm.pid); + if (retval) { + sock_release(socket); + goto close_fail; + } + + /* + * ... and set the coredump information so userspace + * has it available after connect()... + */ + pidfs_coredump(&cprm); + + /* + * ... On connect() the peer credentials are recorded + * and @cprm.pid registered in pidfs...I don't understand this comment. Wasn't "@cprm.pid registered in pidfs" above with the explicit `pidfs_register_pid(cprm.pid)`?quoted
+ */ retval = kernel_connect(socket, (struct sockaddr *)(&addr), addr_len, O_NONBLOCK | SOCK_COREDUMP); + + /* ... So we can safely put our pidfs reference now... */ + pidfs_put_pid(cprm.pid);Why can we safely put the pidfs reference now but couldn't do it before the kernel_connect()? Does the kernel_connect() look up this pidfs entry by calling something like pidfs_alloc_file()? Or does that only happen later on, when the peer does getsockopt(SO_PEERPIDFD)?quoted
if (retval) { if (retval == -EAGAIN) coredump_report_failure("Coredump socket %s receive queue full", addr.sun_path);[...]quoted
diff --git a/fs/pidfs.c b/fs/pidfs.c index 3b39e471840b..d7b9a0dd2db6 100644 --- a/fs/pidfs.c +++ b/fs/pidfs.c[...]quoted
@@ -280,6 +299,13 @@ static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg) } } + if (mask & PIDFD_INFO_COREDUMP) { + kinfo.mask |= PIDFD_INFO_COREDUMP; + smp_rmb();I assume I would regret it if I asked what these barriers are for, because the answer is something terrifying about how we otherwise don't have a guarantee that memory accesses can't be reordered between multiple subsequent syscalls or something like that? checkpatch complains about the lack of comments on these memory barriers.quoted
+ kinfo.coredump_cookie = READ_ONCE(pidfs_i(inode)->__pei.coredump_cookie); + kinfo.coredump_mask = READ_ONCE(pidfs_i(inode)->__pei.coredump_mask); + } + task = get_pid_task(pid, PIDTYPE_PID); if (!task) { /*[...]quoted
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index a9d1c9ba2961..053d2e48e918 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c[...]quoted
@@ -742,6 +743,7 @@ static void unix_release_sock(struct sock *sk, int embrion) struct unix_peercred { struct pid *peer_pid; + u64 cookie;Maybe add a comment here documenting that for now, this is assumed to be used exclusively for coredump sockets.quoted
const struct cred *peer_cred; };