Re: [PATCH] builtin/pack-objects: don't fetch objects when merging packs
From: Junio C Hamano <hidden>
Date: 2026-02-11 17:21:19
Patrick Steinhardt [off-list ref] writes:
The "--stdin-packs" option can be used to merge objects from multiple packfiles given via stdin into a new packfile. One big upside of this option is that we don't have to perform a complete rev walk to enumerate objects. Instead, we can simply enumerate all objects that are part of the specified packfiles, which can be significantly faster in very large repositories. There is one downside though: when we don't perform a rev walk we also don't have a good way to learn about the respective object's names. As a consequence, we cannot use the name hashes as a heuristic to get better delta selection. We try to offset this downside though by performing a localized rev walk: we queue all objects that we're about to repack as interesting, and all objects from excluded packfiles as uninteresting. We then perform a best-effort rev walk that allows us to fill in object names. There is one gotcha here though: when "--exclude-promisor-objects" has not been given we will perform backfill fetches for any promised objects that are missing. This used to not be an issue though as this option was mutually exclusive with "--stdin-packs". But that has changed recently, and starting with dcc9c7ef47 (builtin/repack: handle promisor packs with geometric repacking, 2026-01-05) we will now repack promisor packs during geometric compaction. The consequence is that a geometric repack may now perform a bunch of backfill fetches. We of course cannot passe "--exclude-promisor-objects" to fix this issue -- after all, the whole intent is to repack objects part of a promisor pack. But arguably we don't have to: the rev walk is intended as best effort, and we already configure it to ignore missing links to other objects. So we can adapt the walk to unconditionally disable fetching any missing objects.
"passe" -> "pass". Other than that, very nicely described, and the implementation is surprisingly simple (thanks to a single global variable, and asumption that makes it safe to use such a single global variable, i.e., there is just one packing operation running at a time). Will queue. Thanks.