On Thu, Nov 26, 2020 at 09:04:35PM +0100, René Scharfe wrote:
> > We spawn an external pack-objects process to actually send objects to
> > the remote side. If we are killed by a signal during this process,
> > then pack-objects may continue to run. As soon as it starts producing
> > output for the pack, it will see a failure writing to upload-pack and
> > exit itself. But before then, it may do significant work traversing
> > the object graph, compressing deltas, etc, which will all be
> > pointless. So let's make sure to kill as soon as we know that the
> > caller will not read the result.
>
> Thanks, that reads well.
>
> The patch is trivial, you don't need my sign-off. You could record Peff
> as its author, as he contributed the most to the version in seen.
I didn't want this topic to be forgotten, so here it is with me as the
author, my signoff, and an overview of the reproduction in the commit
message.
(I am perfectly happy for René to be author, but I am mainly interested
in resolving the signoff issue; I agree most of the work was in the
diagnosis, and I did re-type the single line all by myself ;) ).
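In case it helps anyone reading along, here is a rough standalone sketch of the idea behind clean_on_exit: remember the child's pid, and kill it from both an atexit() handler and a signal handler. This is not the run-command.c code. The names helper_pid, kill_helper(), start_helper_clean_on_exit() and helper_dies_with_parent() are made up for illustration, the sleep() stands in for pack-objects' traversal work, and the pipe exists only so the demo can observe when the helper has gone away.

```c
#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* pid of the helper to clean up (hypothetical name, not git's) */
static volatile pid_t helper_pid;

static void kill_helper(void)
{
	if (helper_pid > 0)
		kill(helper_pid, SIGTERM);
}

static void kill_helper_on_signal(int sig)
{
	kill_helper();
	/* restore the default action and re-raise so we still die */
	signal(sig, SIG_DFL);
	raise(sig);
}

/* Fork a helper and arrange for it to be killed on exit or SIGTERM. */
static pid_t start_helper_clean_on_exit(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		sleep(30); /* stand-in for expensive work */
		_exit(0);
	}
	if (pid > 0) {
		helper_pid = pid;
		atexit(kill_helper);
		signal(SIGTERM, kill_helper_on_signal);
	}
	return pid;
}

/*
 * Run a short-lived "parent" that starts a helper and then dies, either
 * via exit() or via SIGTERM. The helper inherits the write end of a
 * pipe, so the read end sees EOF only once the helper is gone.
 *
 * Returns 1 if the helper went away within 5 seconds, 0 if it was
 * still around, and -1 if the demo could not be set up.
 */
int helper_dies_with_parent(int by_signal)
{
	int fd[2], ready, status;
	struct pollfd pfd;
	pid_t parent;

	if (pipe(fd) < 0)
		return -1;
	fflush(NULL); /* avoid flushing buffered output twice */
	parent = fork();
	if (parent < 0)
		return -1;
	if (parent == 0) {
		close(fd[0]);
		if (start_helper_clean_on_exit() < 0)
			_exit(1);
		close(fd[1]); /* only the helper holds the write end now */
		if (by_signal)
			raise(SIGTERM);
		exit(0); /* runs the atexit() handler */
	}

	close(fd[1]);
	pfd.fd = fd[0];
	pfd.events = POLLIN;
	ready = poll(&pfd, 1, 5000);
	close(fd[0]);
	if (waitpid(parent, &status, 0) < 0)
		return -1;
	if (WIFEXITED(status) && WEXITSTATUS(status) == 1)
		return -1;
	return ready > 0;
}
```

Calling helper_dies_with_parent(0) exercises the exit path and helper_dies_with_parent(1) the signal path; both should return 1. Without the atexit() and signal() registration, the helper keeps running after its parent is gone, which is the situation the quoted description is about.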
-- >8 --
Subject: [PATCH] upload-pack: kill pack-objects helper on signal or exit
We spawn an external pack-objects process to actually send objects to
the remote side. If we are killed by a signal during this process, then
pack-objects may continue to run. As soon as it starts producing output
for the pack, it will see a failure writing to upload-pack and exit
itself. But before then, it may do significant work traversing the
object graph, compressing deltas, etc, which will all be pointless. So
let's make sure to kill as soon as we know that the caller will not read
the result.
There's no test here, since it's inherently racy, but here's an easy
reproduction on a large-ish repo like linux.git:
- make sure you don't have pack bitmaps (since they make the enumerating
phase go quickly). For linux.git it takes ~30s or so to walk the
whole graph on my machine.
- run "git clone --no-local -q . dst"; the "-q" is important because
if pack-objects is writing progress to upload-pack (to get
multiplexed over the sideband to the client), then it will notice
pretty quickly the failure to write to stderr
- kill the client-side clone process in another terminal (don't use
^C, as that will send SIGINT to all of the processes)
- run "ps au | grep git" or similar to observe upload-pack dying
within 5 seconds (it will send a keepalive that will notice the
client has gone away)
- but you'll still see pack-objects consuming 100% CPU (and 1GB+ of
RAM) during the traversal and delta compression phases. It will exit
as soon as it starts to write the pack (when it will notice that
upload-pack went away).
With this patch, pack-objects exits as soon as upload-pack does.
Signed-off-by: Jeff King <redacted>
---
upload-pack.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/upload-pack.c b/upload-pack.c
index 5dc8e1f844..1006bebd50 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -321,6 +321,7 @@ static void create_pack_file(struct upload_pack_data *pack_data,
 	pack_objects.in = -1;
 	pack_objects.out = -1;
 	pack_objects.err = -1;
+	pack_objects.clean_on_exit = 1;
 	if (start_command(&pack_objects))
 		die("git upload-pack: unable to fork git-pack-objects");
-- 
2.29.2.893.g57eb4d1d5a