Re: [PATCH 0/6] [RFC] partial-clone: add ability to refetch with expanded filter
From: Jeff Hostetler <hidden>
Date: 2022-02-07 19:46:53
On 2/1/22 10:49 AM, Robert Coup via GitGitGadget wrote:
If a filter is changed on a partial clone repository, for example from
blob:none to blob:limit=1m, there is currently no straightforward way to
bulk-refetch the objects that match the new filter for existing local
commits. This is because the client will report commits as "have" during
negotiation and any dependent objects won't be included in the transferred
pack. Another use case is discussed at [1].
This patch series proposes adding a --refilter option to fetch & fetch-pack
to enable doing a full fetch with a different filter, as if the local has no
commits in common with the remote. It builds upon cbe566a071
("negotiator/noop: add noop fetch negotiator", 2020-08-18).
To note:
1. This will produce duplicated objects between the existing and newly
fetched packs, but gc will clean them up.
2. This series doesn't check that there's a new filter in any way, whether
configured via config or passed via --filter=. Personally I think that's
fine.
3. If a user fetches with --refilter applying a more restrictive filter
than previously (eg: blob:limit=1m then blob:limit=1k) the eventual
state is a no-op, since any referenced object already in the local
repository is never removed. Potentially this could be improved in
future by more advanced gc, possibly along the lines discussed at [2].

Yes, it would be nice to have a way to efficiently extend a
partial clone with a more inclusive filter.

It would be nice to be able to send the old filter-spec and the
new filter-spec and ask the server to send "new && !old" to keep
from having to resend the objects that the client already has.
But I'm not sure we know enough (on either side) to make that
computation. And as you say, there is no guarantee that the client
has only used one filter in the past.

Jeff
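
To make the "new && !old" idea concrete, here is a rough Python sketch
of the set computation for the two blob filters mentioned in this thread
(blob:none and blob:limit=<n>). It is only an illustration, not anything
git implements: parse_filter() and blobs_to_send() are hypothetical
names, the blob names and sizes are made up, and it assumes the client
holds exactly the blobs the old filter admitted, which is the very
assumption that cannot be guaranteed in practice.

```python
from typing import Callable, Dict, List

# Size suffixes accepted by blob:limit=<n>[kmg].
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_filter(spec: str) -> Callable[[int], bool]:
    """Return a predicate that is True when a blob of the given size
    would be included (not omitted) under the filter-spec.

    Hypothetical helper: only blob:none and blob:limit=<n>[kmg]
    are modelled here.
    """
    if spec == "blob:none":
        return lambda size: False
    prefix = "blob:limit="
    if spec.startswith(prefix):
        raw = spec[len(prefix):].lower()
        multiplier = 1
        if raw and raw[-1] in UNITS:
            multiplier = UNITS[raw[-1]]
            raw = raw[:-1]
        limit = int(raw) * multiplier
        # blob:limit omits blobs of size at least <n>.
        return lambda size: size < limit
    raise ValueError(f"unsupported filter-spec: {spec}")


def blobs_to_send(blob_sizes: Dict[str, int],
                  old_spec: str, new_spec: str) -> List[str]:
    """Blobs matching the new filter but not the old one ("new && !old")."""
    old = parse_filter(old_spec)
    new = parse_filter(new_spec)
    return sorted(name for name, size in blob_sizes.items()
                  if new(size) and not old(size))


if __name__ == "__main__":
    # Made-up blob names and sizes in bytes.
    blobs = {"README": 100, "logo.png": 2000, "video.bin": 2_000_000}
    print(blobs_to_send(blobs, "blob:none", "blob:limit=1m"))
    print(blobs_to_send(blobs, "blob:limit=1k", "blob:limit=1m"))
```

Under that assumption, going from blob:limit=1m to blob:limit=1k yields
an empty set, matching the no-op described in point 3 above. If the
client has fetched with more than one filter in the past, or has
lazily fetched individual objects, a single old filter-spec no longer
describes what it has, and the difference would be wrong.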