Re: [bug] git clone command leaves orphaned ssh process
From: Max Amelchenko <hidden>
Date: 2023-09-24 10:25:24
Thanks, I just wanted to clarify something. This will not be handled by AWS (we had a support ticket about this case), since they do not interfere with the processes running on their infrastructure. If a problematic process is causing this buildup of orphaned processes, it needs to be handled by that process.

The question is, doesn't Git want to ensure a clean exit in all cases? This is a clear example of a non-clean exit.

On Tue, Sep 12, 2023 at 7:33 AM Jeff King [off-list ref] wrote:
> On Mon, Sep 11, 2023 at 08:40:49PM -0400, Aaron Schrab wrote:
>
>> At 13:11 +0300 11 Sep 2023, Max Amelchenko [off-list ref] wrote:
>>> Maybe it's connected also to the underlying infrastructure? We are getting this in AWS lambda jobs and we're hitting a system limit of max processes because of it.
>>
>> Running as a lambda, or in a container, could definitely be why you're seeing a difference. Normally when a process is orphaned it gets adopted by `init` (PID 1), and that will take care of cleaning up after orphaned zombie processes. But most of the time containers just run the configured process directly, without an init process. That leaves nothing to clean orphan processes.
>
> Yeah, that seems like the culprit. If the clone finishes successfully, we do end up in finish_connect(), where we wait() for the process. But if we exit early (in this case, ssh bails and we get EOF on the pipe reading from it), then we may call die() and exit immediately.
>
> We _could_ take special care to add every spawned process to a global list, set up handlers via atexit() and signal(), and then reap the processes. But traditionally it's not a big deal to exit with un-reaped children, and this is the responsibility of init. I'm not sure it makes sense for Git to basically reimplement that catch-all (and of course we cannot even do it reliably if we are killed by certain signals).
>> Although for that to really be a problem, would require hitting that max process limit inside a single container invocation. Of course since containers usually aren't meant to be spawning a lot of processes, that limit might be a lot lower than on a normal system. I know that Docker provides a way to include an init process in the started container (`docker run --init`), but I don't think that AWS Lambda does.
>
> I don't know anything about Lambda, but if you are running arbitrary commands, then it seems like you could insert something like this:
>
>   https://github.com/krallin/tini
>
> into the mix. I much prefer that to teaching Git to try to do the same thing in-process.
>
> -Peff
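
As an illustration of the reaping job that an init process performs, here is a minimal Python sketch. It assumes the long-running process that invokes git is also the process that ends up adopting the orphaned ssh children (for example because it is PID 1 in the container). Whether that holds on AWS Lambda is an assumption, not something confirmed in this thread. The function name `reap_children` is hypothetical.

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) pairs. exit_code is the negated
    signal number when the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped
```

One way to use it would be to call `reap_children()` after each git invocation has returned. Because `waitpid(-1, ...)` collects any child, it should not run while the same process still has `subprocess.Popen` objects in flight, since it can take their exit status away from them. A real init such as tini does more than this, including forwarding signals to its child.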