Thread (27 messages) 27 messages, 6 authors, 2022-02-23

Re: [PATCH v2 0/4] [RFC] repack: add --filter=

From: Robert Coup <hidden>
Date: 2022-02-21 15:38:22

Hi,

On Mon, 21 Feb 2022 at 03:11, Taylor Blau [off-list ref] wrote:
we would still be leaving repository
corruption on the table, just making it marginally more difficult to
achieve.
While reviewing John's patch I initially wondered if a better approach
might be something like `git repack -a -d --exclude-stdin`, passing a
list of specific objects to exclude from the new pack (sourced from
rev-list via a filter, etc). To me this seems like a less dangerous
approach, but my concern was it doesn't use the existing filter
capabilities of pack-objects, and we end up generating and passing
around a huge list of oids. And of course any mistakes in the list
generation aren't visible until it's too late.

I also wonder whether there's a race condition if the repository gets
updated? If you're moving large objects out in advance, then filtering
the remainder there's nothing to stop a new large object being pushed
between those two steps and getting dropped.

My other idea, which is growing on me, is whether repack could
generate two valid packs: one for the included objects via the filter
(as John's change does now), and one containing the filtered-out
objects. `git repack -a -d --split-filter=<filter>` Then a user could
then move/extract the second packfile to object storage, but there'd
be no way to *accidentally* corrupt the repository by using a bad
option. With this approach the race condition above goes away too.

    $ git repack -a -d -q --split-filter=blob:limit=1m
    pack-37b7443e3123549a2ddee31f616ae272c51cae90
    pack-10789d94fcd99ffe1403b63b167c181a9df493cd

First pack identifier being the objects that match the filter (ie:
commits/trees/blobs <1m), and the second pack identifier being the
objects that are excluded by the filter (blobs >1m).

An astute --i-know-what-im-doing reader could spot that you could just
delete the second packfile and achieve the same outcome as the current
proposed patch, subject to being confident the race condition hadn't
happened to you.

Thanks,
Rob :)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help