Thread (2 messages) 2 messages, 2 authors, 2025-05-30

Re: [External] Re: [PATCH] promisor-remote: remove the promisor object check for failed fetch

From: Han Young <hidden>
Date: 2025-05-30 13:26:22

On Thu, May 29, 2025 at 11:23 PM Junio C Hamano [off-list ref] wrote:
So your "agonizingly slow" comes from just one
single call to is_promisor_object() function is ultra slow?
That's correct, is_promisor_object() function constructs the promisor
objects oid set by iterating through every object in the promisor
packfiles and calling add_promisor_object() function for each object.
add_promisor_object() function parses the object, adds itself and the
objects it refers to to the oid set. For a very large partial clone
repository, the promisor packfiles will contain millions of objects.
Parsing each one takes a considerable amount of time.
But at the same time, it sounds like is_promisor_object() seriously
is wrong.  Perhaps we need to tell pack-objects to pre-compute the
packfile.c:add_promisor_object() stuff and cache the result in an
on-disk file, just like reverse index is stored in an auxiliary
file?
As Calvin summarized in this thread [1], creating a promisor object set
is very expensive. The fact that a local object can become a
"promisor object" makes disk caching infeasible.
is_promisor_object() function is not inherently wrong, it's the definition
of "promisor object" that makes implementation difficult.
What do the callers of the function use to "filter out local
objects" to ensure that "all objects passed ... are promisor
objects" do?  Have they already spent agonizingly large amount of
time to do so?
They call oid_object_info_extended() function to check whether the object
exists in the local storage. Only objects that do not exist locally are
sent to promisor_remote_get_direct().
So objects in the
array are either promisor objects or missing due to repository
corruption---we simply cannot tell.  I suspect that the claim
"everything the caller calls the function with is a promisor object"
is not exactly correct.
You are right that objects that are not present in the local repository
do not equal promisor objects. The array could contain missing objects
that are not promisor objects.
The end-result when remaining_nr is not zero (i.e., some objects
that the caller wanted us to be fetched) would not exactly be the
same.  The function used to die only when the object we failed to
obtain was what a promisor remote promised to give us.  With this
change, we also die when an object the caller asked us to fetch is
not promised by any promisor.  I do not know what the implication
of this behaviour change would be.
I tested this patch with git-cat-file and git-diff on a repository
missing a normal object. The commands fail regardless of the patch,
but the error message is different. The message changed from
fatal: Not a valid object name
to
fatal: could not fetch ... from promisor remote

[1] https://lore.kernel.org/git/CAFySSZCyoaKCGycYgJjCJGJ2mV1yfg+gVFb7RytGKmkjupkNkQ@mail.gmail.com/ (local)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help