Thread (43 messages) 43 messages, 4 authors, 2025-12-02

Re: [PATCH v2] fetch: fix non-conflicting tags not being committed

From: Patrick Steinhardt <hidden>
Date: 2025-11-06 11:50:32

On Thu, Nov 06, 2025 at 09:39:25AM +0100, Karthik Nayak wrote:
The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.

One scenario which was missed here, was fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This mean that no updates were applied.
Okay, this is obviously broken indeed.
This also extends to backfilling tags when using the now deprecated
'branches/' format for remotes.
I'm a bit lost here -- what does backfilling have to do with the
"branches/" directory? The backfill is supposed to create tags that
point into the history that one has just fetched. So:

  - With `--tags` we fetch all tags announced by the remote.

  - With `--no-tags` we fetch no tags.

  - Otherwise we fetch those tags that point into our history.

The last behaviour is a bit more on the esoteric side, but it's
described as such in git-fetch(1):

    By default, any tag that points into the histories being fetched is
    also fetched; the effect is to fetch tags that point at branches
    that you are interested in. This default behavior can be changed by
    using the --tags or --no-tags options or by configuring
    remote.<name>.tagOpt. By using a refspec that fetches tags
    explicitly, you can fetch tags that do not point into branches you
    are interested in as well.

The following test demonstrates this behaviour:

	test_expect_success "fetch single branch without explicit tag option" '
		git init source &&
		git -C source commit --allow-empty --message common &&
		git clone file://"$(pwd)"/source target &&
		(
			cd source &&
			git commit --allow-empty --message discard-me &&
			git tag discard-me &&
			git commit --amend --allow-empty --message fetch-me &&
			git tag fetch-me
		) &&

		# The "discard-me" tag does not point into the history that we are
		# about to fetch, so it should not have been created.
		git -C target fetch origin &&
		git -C target tag -l >actual &&
		echo "fetch-me" >expect &&

		# But with "--tags" we instruct git-fetch(1) to fetch all tags, so we
		# should now see it.
		git -C target fetch origin --tags &&
		git -C target tag -l >actual &&
		cat >expect <<-\EOF &&
		discard-me
		fetch-me
		EOF
		test_cmp expect actual
	'
quoted hunk ↗ jump to hunk
diff --git a/builtin/fetch.c b/builtin/fetch.c
index c7ff3480fb..d5aee5af10 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1686,6 +1686,42 @@ static void ref_transaction_rejection_handler(const char *refname,
 	*data->retcode = 1;
 }
 
+/*
+ * Commit the reference transaction. If it isn't an atomic transaction, handle
+ * rejected updates as part of using batched updates.
+ */
+static int commit_ref_transaction(struct ref_transaction **transaction,
+				  bool is_atomic, const char *remote_name,
+				  struct strbuf *err)
+{
+	int retcode = ref_transaction_commit(*transaction, err);
+	if (retcode) {
+		/*
+		 * Explicitly handle transaction cleanup to avoid
+		 * aborting an already closed transaction.
+		 */
+		ref_transaction_free(*transaction);
+		*transaction = NULL;
+	}
+
+	if (*transaction && !is_atomic) {
+		struct ref_rejection_data data = {
+			.conflict_msg_shown = 0,
+			.remote_name = remote_name,
+			.retcode = &retcode,
+		};
+
+		ref_transaction_for_each_rejected_update(*transaction,
+							 ref_transaction_rejection_handler,
+							 &data);
+
+		ref_transaction_free(*transaction);
+		*transaction = NULL;
+	}
Okay. Do we need to discern cases where this is called and we haven't
managed to even queue a single reference update?
+	return retcode;
+}
+
 static int do_fetch(struct transport *transport,
 		    struct refspec *rs,
 		    const struct fetch_config *config)
Nit: it might make sense to have a preparatory commit that extracts the
function but that is otherwise a no-op change.
quoted hunk ↗ jump to hunk
@@ -1826,6 +1862,10 @@ static int do_fetch(struct transport *transport,
 
 	if (fetch_and_consume_refs(&display_state, transport, transaction, ref_map,
 				   &fetch_head, config)) {
+		/* As we're using batched updates, commit any pending updates. */
+		if (!atomic_fetch)
+			commit_ref_transaction(&transaction, false,
+					       transport->remote->name, &err);
 		retcode = 1;
 		goto cleanup;
 	}
Hm. Don't we also have to unset the transaction now? Ah, no, you pass
the pointer to the transaction here and set it to `NULL` in
`commit_ref_transaction()`. Makes sense.
quoted hunk ↗ jump to hunk
@@ -1848,8 +1888,12 @@ static int do_fetch(struct transport *transport,
 			 * the transaction and don't commit anything.
 			 */
 			if (backfill_tags(&display_state, transport, transaction, tags_ref_map,
-					  &fetch_head, config))
+					  &fetch_head, config)) {
+				if (!atomic_fetch)
+					commit_ref_transaction(&transaction, false,
+							       transport->remote->name, &err);
 				retcode = 1;
+			}
 		}
 
 		free_refs(tags_ref_map);
We now have three different callsites where we commit the transaction.
It gets better due to the newly introduced function, but it overall
feels somewhat fragile regardless of that.

Thanks!

Patrick
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help