Re: [PATCH 9/9] refs: reimplement refs_delete_refs() and run hook once
From: Michael Heemskerk <hidden>
Date: 2022-08-02 12:42:16
Let me re-share some questions/suggestions/objections I got on a patch I shared with similar changes: https://lore.kernel.org/git/pull.1228.git.1651676435634.gitgitgadget@gmail.com/ (local) There's a lot to like about the change; it fixes the incorrect invocation of the reference-transaction hooks when (bulk) deleting refs, but there is a down-side that Patrick pointed out. We never got to a satisfactory solution, so let me reshare his feedback to pick up the discussion. Patrick:
I really like these changes given that they simplify things, but I
wonder whether we can do them. In the preimage we're eagerly removing
loose refs: any error encountered when deleting a reference is recorded,
but we keep on trying to remove the other refs, as well. With the new
behaviour we now create a single transaction for all refs and try to
commit it. This also means that we'll abort the transaction when locking
any of the refs fails, which is a change in behaviour.
The current behaviour is explicitly documented in `refs.h:refs_delete_refs()`:
/*
* Delete the specified references. If there are any problems, emit
* errors but attempt to keep going (i.e., the deletes are not done in
* an all-or-nothing transaction). msg and flags are passed through to
* ref_transaction_delete().
*/
int refs_delete_refs(struct ref_store *refs, const char *msg,
struct string_list *refnames, unsigned int flags);
There are multiple callsites of this function via `delete_refs()`. Now
honestly, most of these callsites look somewhat broken:
- `bisect.c` simply does its best to clean up bisect state. This
usecase looks fine to me.
- `builtin/branch.c` reports the branches as deleted even if
`delete_refs()` failed.
- `builtin/remote.c` also misreports the deleted branches for the
`prune` verb. The `rm` verb looks alright: if deletion of any
branch failed then it doesn't prune the remote's config in the end
and reports an error.
- `builtin/fetch.c` also misreports deleted branches with `--prune`.
So most of these commands incorrectly handle the case where only a
subset of branches has been deleted. This raises the question whether
the interface provided by `refs_delete_refs()` is actually sensible if
it's so easy to get wrong. It doesn't even report which branches could
be removed and which couldn't. Furthermore, the question is whether new
backends like the reftable backend which write all refs into a single
slice would actually even be in a position to efficiently retain
semantics of this function.
I'm torn. There are valid usecases for eagerly deleting refs even if a
subset of deletions failed, making this change a tough sell, but most of
the callsites don't actually handle this correctly in the first place.At the time, the only solution I could see was to switch to transaction-per-ref semantics, but this results in bad performance when deleting tens of thousands of refs. One option might be to optimistically try to delete the refs in a single transaction. If that fails for whatever reason and multiple ref deletions are requested, we could fall back to a transaction-per-ref approach. That'd keep the common case fast, and still provide best effort deletes. Thoughts? Cheers, Michael Heemskerk On Fri, Jul 29, 2022 at 12:13 PM Jiang Xin [off-list ref] wrote:
quoted hunk ↗ jump to hunk
From: Jiang Xin <redacted> When delete references using "git branch -d" or "git tag -d", there will be duplicate call of "reference-transaction committed" for same refs. This is because "refs_delete_refs()" is called twice, once for files-backend and once for packed-backend, and we used to reinvented the wheel in "files_delete_refs()" and "packed_delete_refs()". By removing "packed_delete_refs()" and reimplement "files_delete_refs()", the "reference-transaction" hook will run only once for deleted branches and tags. The behavior of the following git commands and the last two testcases have been fixed in t1416: * git branch -d <branch> * git tag -d <tag> A testcase in t5510 is broken because we used to call the function "packed_refs_lock()", but it is not necessary if the deleted reference is not in the "packed-refs" file. Signed-off-by: Jiang Xin <redacted> --- refs/files-backend.c | 21 ++++++------- refs/packed-backend.c | 51 +------------------------------- t/t1416-ref-transaction-hooks.sh | 4 +-- t/t5510-fetch.sh | 17 +++++++++++ 4 files changed, 29 insertions(+), 64 deletions(-)diff --git a/refs/files-backend.c b/refs/files-backend.c index 8baea66e58..21426efaae 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c@@ -1268,31 +1268,27 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags) static int files_delete_refs(struct ref_store *ref_store, const char *msg, struct string_list *refnames, unsigned int flags) { - struct files_ref_store *refs = - files_downcast(ref_store, REF_STORE_WRITE, "delete_refs"); + struct ref_transaction *transaction; struct strbuf err = STRBUF_INIT; int i, result = 0; if (!refnames->nr) return 0; - if (packed_refs_lock(refs->packed_ref_store, 0, &err)) - goto error; - - if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) { - packed_refs_unlock(refs->packed_ref_store); + transaction = ref_store_transaction_begin(ref_store, &err); + if (!transaction) goto error; - } - - packed_refs_unlock(refs->packed_ref_store); for (i = 0; i < refnames->nr; i++) { const char *refname = refnames->items[i].string; - - if (refs_delete_ref(&refs->base, msg, refname, NULL, flags)) + if (ref_transaction_delete(transaction, refname, NULL, + flags, msg, &err)) result |= error(_("could not remove reference %s"), refname); } + if (ref_transaction_commit(transaction, &err)) + goto error; + ref_transaction_free(transaction); strbuf_release(&err); return result;@@ -1309,6 +1305,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, else error(_("could not delete references: %s"), err.buf); + ref_transaction_free(transaction); strbuf_release(&err); return -1; }diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 97b6837767..fdb7a0a52c 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c@@ -1519,55 +1519,6 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store, return ref_transaction_commit(transaction, err); } -static int packed_delete_refs(struct ref_store *ref_store, const char *msg, - struct string_list *refnames, unsigned int flags) -{ - struct packed_ref_store *refs = - packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs"); - struct strbuf err = STRBUF_INIT; - struct ref_transaction *transaction; - struct string_list_item *item; - int ret; - - (void)refs; /* We need the check above, but don't use the variable */ - - if (!refnames->nr) - return 0; - - /* - * Since we don't check the references' old_oids, the - * individual updates can't fail, so we can pack all of the - * updates into a single transaction. - */ - - transaction = ref_store_transaction_begin(ref_store, &err); - if (!transaction) - return -1; - - for_each_string_list_item(item, refnames) { - if (ref_transaction_delete(transaction, item->string, NULL, - flags, msg, &err)) { - warning(_("could not delete reference %s: %s"), - item->string, err.buf); - strbuf_reset(&err); - } - } - - ret = ref_transaction_commit(transaction, &err); - - if (ret) { - if (refnames->nr == 1) - error(_("could not delete reference %s: %s"), - refnames->items[0].string, err.buf); - else - error(_("could not delete references: %s"), err.buf); - } - - ref_transaction_free(transaction); - strbuf_release(&err); - return ret; -} - static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags) { /*@@ -1595,7 +1546,7 @@ struct ref_storage_be refs_be_packed = { .pack_refs = packed_pack_refs, .create_symref = NULL, - .delete_refs = packed_delete_refs, + .delete_refs = NULL, .rename_ref = NULL, .copy_ref = NULL,diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh index df75e5727c..f64166f9d7 100755 --- a/t/t1416-ref-transaction-hooks.sh +++ b/t/t1416-ref-transaction-hooks.sh@@ -744,7 +744,7 @@ test_expect_success "branch: rename branches" ' test_cmp_heads_and_tags -C workdir expect ' -test_expect_failure "branch: remove branches" ' +test_expect_success "branch: remove branches" ' test_when_finished "rm -f $HOOK_OUTPUT" && cat >expect <<-EOF &&@@ -873,7 +873,7 @@ test_expect_success "tag: update refs to create loose refs" ' test_cmp_heads_and_tags -C workdir expect ' -test_expect_failure "tag: remove tags with mixed ref_stores" ' +test_expect_success "tag: remove tags with mixed ref_stores" ' test_when_finished "rm -f $HOOK_OUTPUT" && cat >expect <<-EOF &&diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index b45879a760..22de7ac9ec 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh@@ -168,6 +168,8 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' ' cd "$D" && git clone . prune-fail && cd prune-fail && + git update-ref refs/remotes/origin/extrabranch main~ && + git pack-refs --all && git update-ref refs/remotes/origin/extrabranch main && : this will prevent --prune from locking packed-refs for deleting refs, but adding loose refs still succeeds && >.git/packed-refs.new &&@@ -175,6 +177,21 @@ test_expect_success REFFILES 'fetch --prune fails to delete branches' ' test_must_fail git fetch --prune origin ' +test_expect_success REFFILES 'fetch --prune ok for loose refs not in locked packed-refs' ' + test_when_finished "cd \"$D\"; rm -rf \"prune-ok-ref-not-packed\"" && + cd "$D" && + git clone . prune-ok-ref-not-packed && + ( + cd prune-ok-ref-not-packed && + git update-ref refs/remotes/origin/extrabranch main && + : for loose refs not in packed-refs, we can delete them even the packed-refs is locked && + :>.git/packed-refs.new && + + git fetch --prune origin && + test_must_fail git rev-parse refs/remotes/origin/extrabranch -- + ) +' + test_expect_success 'fetch --atomic works with a single branch' ' test_when_finished "rm -rf \"$D\"/atomic" && --2.36.1.25.gc87d5ad63a.dirty