Re: [RFC PATCH v2 1/2] fetch: set-head with --set-head option

From: Bence Ferdinandy <hidden>
Date: 2024-09-11 12:19:37

Possibly related (same subject, not in this thread)

2024-09-11 · Re: [RFC PATCH v2 1/2] fetch: set-head with --set-head option · Junio C Hamano <hidden>
2024-09-10 · [RFC PATCH v2 1/2] fetch: set-head with --set-head option · Bence Ferdinandy <hidden>

Hey Junio,

thanks for the very speedy feedback!

On Wed Sep 11, 2024 at 00:19, Junio C Hamano [off-list ref] wrote:

[snip]

Ideally, because every "git fetch" you do will automatically learn
what their HEAD of the day points at, even without "--set-head", it
may be nice to let the user know when their HEAD changed, so that
the user can inspect the situation and decide.

That would actually make sense, we could print a message saying HEAD has
changed and I guess helpfully print the exact set-head command they would need
to manually update should they wish to do so.

If "fetch $repo", when it notices that refs/remotes/$repo/HEAD is
missing, always unconditionally stores where their HEAD points at
in refs/remotes/$repo/HEAD, and did nothing else, wouldn't that be
sufficient?

The users have "remote set-head" to do this when needed.  What is
the true motivation that "fetch" (which presumably happens a lot
more often) needs to be involved in this process?  The *only* upside
I can see with "fetch --set-head" to blindly follow every switch of
HEAD on the remote end is that you can keep up with the remote that
flips its HEAD very often, but is that really a realistic need?  If
we reject such a use case as invalid, I suspect that the end-user
experience would be simplified quite a lot.  Imagine that we teach
"git fetch $repo" to notice refs/remotes/$repo/HEAD is missing, and
create it from the remote HEAD information automatically.  And we do
NOTHING ELSE.  What would the end-user experience look like?

 * Usually, you start with "git clone" and 'origin' will know which
   branch 'origin/HEAD' points at.

 * You may run "git remote add -f $repo $URL" to add one.  Because
   this runs "git fetch $repo", the previous addition to the "git
   fetch" will make sure refs/remotes/$repo/HEAD would be there.

 * You may run "git remote add $repo $URL" to add one, and then
   separately "git fetch $repo" yourself.  The end result would be
   the same; refs/remotes/$repo/HEAD will be there.

I'm convinced :) It also does drop a lot of complexity. So there will be no
extra flag for fetch, rather:

- if the remote HEAD does not exist, we create it

- if it does exist, but has changed we print a message saying it was changed from X to Y and print the required command to update to this

- no configuration needed

The only place I can imagine this not being sufficient is in a CI/CD process
that is fetching into a cached repo needs to be updated, but then those people
can just always run remote set-head -a.

Having said all that, the implementation here ...

quoted

+static int run_set_head(const char *name)
+{
+	struct child_process cmd = CHILD_PROCESS_INIT;
+	strvec_push(&cmd.args, "remote");
+	strvec_push(&cmd.args, "set-head");
+	strvec_push(&cmd.args, "--auto");
+	strvec_push(&cmd.args, name);
+	cmd.git_cmd = 1;
+	return run_command(&cmd);
+}

... does look quite straight-forward.

Unfortunately, as Jeff has pointed out in the other thread, this implementation
requires the user to authenticate twice ... Now, we could still rely on
set-head to actually do the writing, since if you explicitly give it what to
set to it will not do a network query, but considering all of the above,
I think it would make more sense not to do this, rather just bring in the
required logic here. This would also allow for the second patch to be a bit
more explicit on what did or did not happen during a remote set-head.

quoted

+static int fetch_multiple(struct string_list *list, int max_children, int set_head,
+			const struct fetch_config *config)
 {
 	int i, result = 0;
 	struct strvec argv = STRVEC_INIT;

@@ -2014,6 +2025,8 @@ static int fetch_multiple(struct string_list *list, int max_children,
 				error(_("could not fetch %s"), name);
 				result = 1;
 			}
+			if (set_head && run_set_head(name))
+				result = 1;

This looked a bit questionable---I expected that we'd run the
subfetches with "--set-head" option, but this makes us do the
set-head step ourselves after the subfetches are done.  What are
pros-and-cons considered to reach the decision to do it this way?

Just noobish oversight, I figured out how to do it in the subfetch, but with
the above there will be no need for this anyway I guess.

I'll send a v2 soonish.

Best,
Bence

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help