Thread (34 messages) 34 messages, 3 authors, 2025-10-30

Re: [PATCH 5/8] builtin/pack-objects: simplify logic to find kept or nonlocal objects

From: Taylor Blau <hidden>
Date: 2025-10-29 23:13:36

On Tue, Oct 28, 2025 at 12:08:35PM +0100, Patrick Steinhardt wrote:
quoted hunk ↗ jump to hunk
@@ -4388,27 +4388,27 @@ static void add_unreachable_loose_objects(struct rev_info *revs)

 static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
 {
-	struct packfile_store *packs = the_repository->objects->packfiles;
 	static struct packed_git *last_found = (void *)1;
 	struct packed_git *p;

-	p = (last_found != (void *)1) ? last_found :
-					packfile_store_get_packs(packs);
+	if (last_found != (void *)1 && find_pack_entry_one(oid, last_found))
+		return 1;
This all looks right to me, since we only assign last_found when a pack
meets the kept-or-non-local criteria, which is good.
-	while (p) {
-		if ((!p->pack_local || p->pack_keep ||
-				p->pack_keep_in_core) &&
-			find_pack_entry_one(oid, p)) {
+	repo_for_each_pack(the_repository, p) {
+		if ((!p->pack_local || p->pack_keep || p->pack_keep_in_core) &&
+		    find_pack_entry_one(oid, p)) {
 			last_found = p;
 			return 1;
 		}
-		if (p == last_found)
-			p = packfile_store_get_packs(packs);
-		else
-			p = p->next;
-		if (p == last_found)
-			p = p->next;
+
+		/*
+		 * We have already checked `last_found`, so there is no need to
+		 * re-check here.
+		 */
+		if (p == last_found && last_found != (void *)1)
+			continue;
Can 'p' ever be (void *)1 here? I would imagine not since this is coming
from repo_for_each_pack(), so I think it would suffice to limit this
conditional to just "if (p == last_found)".

Otherwise looks good. I think you could make use of the kept_cache here
at least for the local-but-kept packs, but what you wrote is definitely
an improvement in readability.

    static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
    {
            static struct packed_git *last_found = (void *)1;
            uint32_t kept_flags = ON_DISK_KEEP_PACKS | IN_CORE_KEEP_PACKS;
            struct packed_git *p;
            struct pack_entry entry;

            if (last_found != (void *)1 && find_pack_entry_one(oid, last_found))
                    return 1;

            if (find_kept_pack_entry(the_repository, oid, flags, &entry)) {
                    last_found = entry.p;
                    return 1;
            }

            repo_for_each_pack(the_repository, p) {
                    if (p->pack_local || p == last_found)
                            continue;
                    if (find_pack_entry_one(oid, p)) {
                            last_found = p;
                            return 1;
                    }
            }
            return 0;
    }

You still end up looping over all of the packs in the worst case, but in
the case where you are going to pick up a copy of the object via a kept
pack, the above should be faster since it will focus the search on those
packs first.

I'm not sure that it matters all that much, since at worst we're
interspersing the search over kept packs with some non-local ones, and
skipping over the rest. But certainly you could imagine cases where the
number of non-local packs greatly outnumbers the local, kept ones, so in
a situation like that I think the above would help.

Probably micro-optimizing this case is not all that useful, since this
only comes up when we are unpacking unreachable objects like with 'git
repack -A', which should be using cruft packs anyway.

Thanks,
Taylor
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help