On Thu, 26 Oct 2006, Junio C Hamano wrote:
I'd almost say "heavy repository-wide operations like 'repack -a
-d' and 'prune' should operate under a single repository lock",
but historically we've avoided locks and instead tried to do
things optimistically and used compare-and-swap to detect
conflicts, so maybe that avenue might be worth pursuing.
How about (I'm thinking aloud and I'm sure there will be
holes -- I won't think about prune for now)...
* "repack -a -d":
(1) initially run show-ref (or "ls-remote .") and store the
result in .git/$ref_pack_lock_file;
(2) enumerate existing packs;
(3) do the usual "rev-list --all | pack-objects" thing; this
may end up including more objects than what are reachable
from the result of (1) if somebody else updates refs in the
meantime;
(4) enumerate existing packs; if there is a difference from (2)
other than what (3) created, that means somebody else added
a pack in the meantime; stop and do not do the "-d" part;
(5) run "ls-remote ." again and compare it with what we got in
(1); if different, somebody else updated a ref in the
meantime; stop and do not do the "-d" part;
(6) do the "-d" part as usual by removing packs we saw in (2)
but do not remove the pack we created in (3);
(7) remove .git/$ref_pack_lock_file.
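To make the compare-and-swap idea in steps (1) to (7) concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the four callables (read_refs, list_packs, make_pack, delete_pack) are hypothetical stand-ins for "ls-remote .", listing .git/objects/pack, "rev-list --all | pack-objects", and unlinking a pack; none of them is an existing git interface. The snapshot from step (1) is kept in memory here instead of in .git/$ref_pack_lock_file, so steps (1) and (7) are reduced to a local variable.

```python
from typing import Callable, Dict, Set


def repack_and_prune(
    read_refs: Callable[[], Dict[str, str]],
    list_packs: Callable[[], Set[str]],
    make_pack: Callable[[], str],
    delete_pack: Callable[[str], None],
) -> bool:
    """Optimistic "repack -a -d": return True if old packs were removed,
    False if a concurrent change was detected and the "-d" part was skipped.

    All four callables are hypothetical hooks supplied by the caller.
    """
    refs_before = read_refs()        # step (1): snapshot of the refs
    packs_before = list_packs()      # step (2): packs that exist now
    new_pack = make_pack()           # step (3): build the new pack

    # step (4): any difference other than our own pack means somebody
    # else added (or removed) a pack in the meantime.
    packs_after = list_packs()
    if (packs_after ^ packs_before) - {new_pack}:
        return False

    # step (5): a changed ref means somebody else updated it meanwhile.
    if read_refs() != refs_before:
        return False

    # step (6): remove only the packs seen in (2), never the new one.
    for name in packs_before - {new_pack}:
        delete_pack(name)
    return True
```

With an in-memory set standing in for the pack directory, a run without interference leaves only the new pack, while a run during which an extra pack shows up returns False and deletes nothing.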
* "fetch --thin" and "index-pack --stdin":
(1) check for .git/$ref_pack_lock_file, and refuse to operate
if it exists (this is not strictly needed for
correctness but only to give an early exit);
I don't think this is a good idea. A fetch should always work
irrespective of any repack taking place. The fetch really should have
priority over a repack since it is directly related to the user
experience. The repack can fail or produce suboptimal results if a race
occurs, but the fetch must not fail for such a reason.
(2) create a new pack under a temporary name, and when
complete, make the pack/index pair .pack and .idx;
Actually this is what already happens if you don't specify a name to
git-index-pack --stdin.
(3) update the refs.
So the actual race is the very small interval between the time the new
pack+index are moved to .git/objects/pack/ and the moment the refs are
updated. In practice this is probably less than a second. All that is
needed on the repack side is to somehow go back to its step (2) if that
interval falls between its steps (2) and (3).
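For illustration, the fetch-side ordering discussed above (pack written under a temporary name, renamed into place, refs updated last) could be sketched as follows in Python. This is not what git-index-pack actually does internally; the function names, the "tmp_pack_" prefix, and the choice to rename the .pack before the .idx are all assumptions of this sketch. The comment in fetch_like marks the short window where a concurrent repack would see the new pack but still the old refs.

```python
import os
import tempfile


def publish_pack(pack_dir: str, name: str, pack_bytes: bytes, idx_bytes: bytes) -> None:
    """Write pack and index under temporary names, then rename into place."""
    tmp_paths = []
    for data in (pack_bytes, idx_bytes):
        fd, tmp = tempfile.mkstemp(prefix="tmp_pack_", dir=pack_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        tmp_paths.append(tmp)
    # Assumption of this sketch: install the .pack first, so that a reader
    # that finds an .idx can also find the pack it describes.
    os.replace(tmp_paths[0], os.path.join(pack_dir, name + ".pack"))
    os.replace(tmp_paths[1], os.path.join(pack_dir, name + ".idx"))


def update_ref(ref_path: str, new_value: str) -> None:
    """Replace the ref file in one rename so readers never see a partial write."""
    ref_dir = os.path.dirname(ref_path)
    os.makedirs(ref_dir, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=ref_dir)
    with os.fdopen(fd, "w") as f:
        f.write(new_value + "\n")
    os.replace(tmp, ref_path)


def fetch_like(pack_dir: str, ref_path: str, name: str,
               pack_bytes: bytes, idx_bytes: bytes, new_value: str) -> None:
    publish_pack(pack_dir, name, pack_bytes, idx_bytes)   # step (2)
    # Race window: the pack is visible here, but the refs still point at
    # the old objects until the next call returns.
    update_ref(ref_path, new_value)                       # step (3)
```

The fetch itself never blocks or fails because of a repack in this arrangement; any detection and retry is left entirely to the repack side.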