Thread (9 messages) 9 messages, 4 authors, 2024-10-14

Re: [External] Re: Missing Promisor Objects in Partial Repo Design Doc

From: Jonathan Tan <hidden>
Date: 2024-10-14 17:52:07

Han Young [off-list ref] writes:
On Sat, Oct 12, 2024 at 10:05 AM Jonathan Tan [off-list ref] wrote:
quoted
So I think we'll need to use do_not_die_on_missing_objects. It does have
the weakness that if the object is not supposed to be missing, we don't
inform the user, but perhaps this is OK here because we know that all
objects we encounter on this object walk are promisor objects, so if
it's missing, it's OK.
And I think users would prefer the git command to succeed if possible,
rather than die on the first (noncritical) error. Maybe show a warning
and swallow the error?
Just to be clear, this is not an error condition so we shouldn't show an
error or warning. Whenever we repack non-promisor objects in a partial
clone we will almost always encounter missing objects. In the gc repack,
this is mitigated by --exclude-promisor-objects, which checks the
promisor object set whenever a missing object is encountered; it does
not show an error if the missing object is in that set.

My proposal is to use do_not_die_on_missing_objects instead, which
always tolerates missing objects (without showing any error or warning).
This is probably not safe enough for the gc repack, but should be OK
for the fetch repack, since we are only repacking ancestors of known
promisor objects (so we can deduce that the missing objects are promisor
objects).
quoted
In addition to do_not_die_on_missing_objects, we'll also need the actual
code that stops iteration through objects that pass our "best effort"
promisor object check. Probably the best place is in get_revision_1()
after the NULL check
get_revision_1() only does commit limiting though. Some callers of rev-list
also do tree walking on commits,
Ah, yes, you're right. The repack on fetch is one such caller (that will
need tree walking).
in a (corrupted) partial repo, tree could
also be missing. There isn't a central place we can stop tree walking,
callers using this feature would have to implement "tree walking early
termination" themself.
The repo could have been cloned with a tree filter (e.g.
"--filter=tree:0") too, in which case trees would be missing even if the
repo is not corrupted. But even in a non-corrupted --filter=blob:none
partial clone, we still don't want to iterate through promisor trees, so
that we don't repack them unnecessarily. So yes, get_revision_1() is not
the only place that needs to be changed.

I think that there is a central place to stop tree walking - in
list-objects.c.

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help