Thread (34 messages) 34 messages, 3 authors, 2025-10-30

Re: [PATCH 4/8] packfile: fix approximation of object counts

From: Taylor Blau <hidden>
Date: 2025-10-29 22:49:19

On Tue, Oct 28, 2025 at 12:08:34PM +0100, Patrick Steinhardt wrote:
None of these are really game-changing. But it's nice to fix those
issues regardless.
Well explained, thank you.
quoted hunk ↗ jump to hunk
While at it, convert the code to use `repo_for_each_pack()`.
Furthermore, use `odb_prepare_alternates()` instead of explicitly
preparing the packfile store. We really only want to prepare the object
database sources, and `get_multi_pack_index()` already knows to prepare
the packfile store for us.

Helped-by: Taylor Blau [off-list ref]
Signed-off-by: Patrick Steinhardt <redacted>
---
 packfile.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/packfile.c b/packfile.c
index 6aa2ca8ac9e..6722c3b2b88 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1143,16 +1143,16 @@ unsigned long repo_approximate_object_count(struct repository *r)
 		unsigned long count = 0;
 		struct packed_git *p;

-		packfile_store_prepare(r->objects->packfiles);
+		odb_prepare_alternates(r->objects);
I was wondering how this worked, since odb_prepare_alternates() does not
eagerly load the packs belonging to a MIDX, but get_multi_pack_index()
does, so this makes sense.

(Writing this out, I realized that you wrote this as the last sentence
in your patch, which is helpful and I think worth doing.)
 		for (source = r->objects->sources; source; source = source->next) {
 			struct multi_pack_index *m = get_multi_pack_index(source);
 			if (m)
-				count += m->num_objects;
+				count += m->num_objects + m->num_objects_in_base;
Oops. This fix is definitely right, thanks for spotting and fixing it.

As a general aside, I expect that we're going to find some more of
these. I tried my best to audit all the places where we use
m->num_objects and m->num_packs, but without having great infrastructure
that encourages the use of MIDX chains, most of this code is all dead
anyway.

Hopefully soon we will see some more usage of MIDX chains with the
incremental repacking work that I've been sending patches for recently.
I'm sure that will flush out more of these issues.
-		for (p = r->objects->packfiles->packs; p; p = p->next) {
-			if (open_pack_index(p))
+		repo_for_each_pack(r, p) {
+			if (open_pack_index(p) || p->multi_pack_index)
Do we care about opening the pack index if we already accounted for it
via the MIDX path above? My guess is not, so I would probably suggest
writing this conditional as:

    if (p->multi_pack_index || open_pack_index(p))
        continue;

to avoid loading pack indexes unless we have to.

Thanks,
Taylor
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help