Re: [RFC PATCH v1 00/13] exec: add spawn templates for repeated executable startup
From: John Ericson <hidden>
Date: 2026-06-09 17:28:14
Also in:
linux-arch, linux-doc, linux-fsdevel, linux-kselftest, linux-mm, lkml
On Tue, Jun 9, 2026, at 10:43 AM, Li Chen wrote:
Hi Andy,
---- On Tue, 09 Jun 2026 08:01:57 +0800 Andy Lutomirski [off-list ref] wrote ---
[...] After contemplating this for a bit... why pidfd? Doesn't a pidfd refer to an actual process that is, or at least was, running? This new thing is a process that we are contemplating spawning. I can imagine that basically all pidfd APIs would be a bit confused by the nonexistence of the process in question.

Yes, I think that is a real concern. In my current local WIP I tried to keep that distinction explicit. pidfd_spawn_open() returns a pidfs-backed builder fd, not a normal pidfd referring to a process. The builder fd is allocated as an anonymous pidfs file with builder-specific file operations:

	file = pidfs_alloc_anon_file("[pidfd_spawn]", &pidfd_spawn_builder_fops,
				     builder, O_RDWR);
What does your builder fd point to, explicitly? For example in my other reply I talked about how it was "real" process state. In my FreeBSD patch, for example, I found there was already a status for a process "in exec", and I figured that was clean to reuse for one of these "embryonic" processes that also hadn't started running. I would reckon that Linux probably has some similar notions.
and the normal pidfd helpers still reject it because it does not use the
ordinary pidfd file operations:
	struct pid *pidfd_pid(const struct file *file)
	{
		if (file->f_op != &pidfs_file_operations)
			return ERR_PTR(-EBADF);
		return file_inode(file)->i_private;
	}
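As a rough userspace illustration of that check (not kernel code), the sketch below models how comparing the file-operations pointer keeps a builder file out of the ordinary pidfd helpers. All names here (model_ops, model_file, model_pidfd_pid) are hypothetical stand-ins invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel's f_op / struct file. */
struct model_ops {
	const char *name;
};

struct model_file {
	const struct model_ops *f_op;	/* identifies what kind of file this is */
	void *private_data;		/* payload: a "pid" or a "builder" */
};

static const struct model_ops model_pidfd_ops = { "pidfd" };
static const struct model_ops model_builder_ops = { "pidfd_spawn builder" };

/*
 * Mirror of the pidfd_pid() pattern: only files that carry the ordinary
 * pidfd ops table are accepted; anything else gets -EBADF and *pid_out
 * is left untouched.
 */
static int model_pidfd_pid(const struct model_file *file, void **pid_out)
{
	if (file->f_op != &model_pidfd_ops)
		return -EBADF;
	*pid_out = file->private_data;
	return 0;
}
```

Because the discriminator is the ops table itself, a builder file needs no extra flag to be rejected; it simply never matches.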
So the current split is:
	builder_fd = pidfd_spawn_open(...);              /* builder object */
	pidfd_config(builder_fd, ...);
	child_pidfd = pidfd_spawn_run(builder_fd, ...);  /* real pidfd */
Only the last fd is a normal pidfd for an actual child process. The builder
fd is only accepted by the builder operations.
This avoids having to define what waitid(P_PIDFD), pidfd_send_signal(),
pidfd_getfd(), poll(), etc. mean before the process exists.

I wouldn't be so sure this is necessary or good. For example, I think it could make sense to wait on a process that has yet to be started; one just waits for the process both to start and to exit. Obviously a blocking syscall in the thread that is spawning the process is not useful, but the asynchronous poll() variant seems fine. As long as there is real process state here, it shouldn't be too hard to implement.
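For reference, here is a minimal sketch of what poll() on a pidfd means today for a process that already exists: the descriptor becomes readable once the process terminates. It uses only the existing pidfd_open(2), poll(2) and waitpid(2) interfaces, not the proposed spawn API. The helper name spawn_and_wait is made up for this example, and the fallback syscall number is an assumption that holds on x86-64 and arm64.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434	/* assumption: x86-64 / arm64 numbering */
#endif

/*
 * Fork a child that exits with `code`, wait for its termination by
 * polling a pidfd, then reap it. Returns the exit status, or -1 on error.
 */
static int spawn_and_wait(int code)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: exit immediately */

	/* pidfd_open(2) needs Linux 5.3+; called via syscall(2) here. */
	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd >= 0) {
		struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
		int rc;

		/* The pidfd polls readable once the process has terminated. */
		do {
			rc = poll(&pfd, 1, -1);
		} while (rc < 0 && errno == EINTR);
		close(pidfd);
	}

	/*
	 * Reap the child. If poll() already reported termination this
	 * returns at once; otherwise (e.g. pidfd_open() unavailable) it
	 * blocks until the child exits.
	 */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under the "wait for start, then exit" reading, the same POLLIN readiness would simply not be reported while the process is still unstarted.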
The downside is that it adds a separate open-style entry point and is less uniform than the pidfd_open(0, PIDFD_EMPTY) spelling Christian sketched.
I do think there is no point in having two file descriptors. The file descriptor that previously referred to the builder/embryonic process can then refer to the real process, right?
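To make the single-descriptor idea concrete, here is a toy state machine for one handle that starts out embryonic and later refers to the running, then exited, process. This is purely a model: the type and function names (model_proc, model_config_nice, model_start, model_exit, model_poll_exited) and the chosen error codes are assumptions for illustration, not anything from the patch series.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical lifecycle of the object behind a single descriptor. */
enum proc_state { PROC_EMBRYO, PROC_RUNNING, PROC_EXITED };

struct model_proc {
	enum proc_state state;
	int nice;	/* example of state configured before start */
	int exit_code;
};

/* Configuration is only allowed while the process is still embryonic. */
static int model_config_nice(struct model_proc *p, int nice)
{
	if (p->state != PROC_EMBRYO)
		return -EBUSY;	/* assumed error code */
	p->nice = nice;
	return 0;
}

/* Starting is a cheap state flip: everything was set up beforehand. */
static int model_start(struct model_proc *p)
{
	if (p->state != PROC_EMBRYO)
		return -EINVAL;	/* assumed error code */
	p->state = PROC_RUNNING;
	return 0;
}

/* Record termination of a running process. */
static int model_exit(struct model_proc *p, int code)
{
	if (p->state != PROC_RUNNING)
		return -EINVAL;	/* assumed error code */
	p->exit_code = code;
	p->state = PROC_EXITED;
	return 0;
}

/*
 * poll()-style readiness: 1 only after exit. An embryonic process is
 * reported as "not ready", the same as a running one, so a waiter
 * implicitly waits for both start and exit.
 */
static int model_poll_exited(const struct model_proc *p)
{
	return p->state == PROC_EXITED;
}
```

In this model nothing changes identity at start time, so there is no second handle to hand out; only the set of permitted operations changes with the state.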
If people think there is a better way to represent the pre-spawn builder state, or if the preference is to integrate it directly into pidfd_open() with an explicit empty/future-pidfd state, I would be happy to discuss that.
Hope the above answers your question? I suppose my ideas lean more on the "future" than "empty" side --- there is indeed a thread in the thread group, with real VM, namespace, file descriptor, etc. state. Moreover, that state gets initialized before the process is started, so the actual start is a pretty lightweight step of just letting the scheduler know the now-ready process can be scheduled.

The only thing that distinguishes the embryonic process from a real one is simply that it isn't running --- i.e. isn't (yet) available to be scheduled --- so the pidfd holders are free to poke at its state.

Cheers,
John