Thread (6 messages) 6 messages, 4 authors, 2022-02-22

Re: [PATCH v3 04/15] merge-tree: implement real merges

From: Johannes Schindelin <hidden>
Date: 2022-02-21 09:14:23
Subsystem: the rest · Maintainer: Linus Torvalds

Possibly related (same subject, not in this thread)

Hi,

On Thu, 3 Feb 2022, Elijah Newren wrote:
On Thu, Feb 3, 2022 at 2:42 AM Johannes Altmanninger [off-list ref] wrote:
quoted
On Wed, Feb 02, 2022 at 04:18:39PM -0800, Elijah Newren wrote:
quoted
On Wed, Feb 2, 2022 at 2:01 PM Junio C Hamano [off-list ref] wrote:
quoted
Elijah Newren [off-list ref] writes:
quoted
Yes, you are reading right.  I think the cherry-pick/rebase
replacement actually deserves a separate command from what merges
should use; replaying a sequence of commits just has a number of UI
differences and abilities that I think pull it in a different
direction.
I completely disagree.  Each individual step in a sequence of
replaying commits in order (or in reverse order) should be
scriptable as a single merge-tree that takes "apply the change to go
from A^ to A on X".  Sequencing and placing UI around it is a job
for the script that drives merge-tree.
Adding such an ability to merge-tree would be trivial -- it basically
involves just two things: (1) accepting one extra argument, and (2)
calling merge_incore_nonrecursive() instead of
merge_incore_recursive().

However, I think forking a subprocess for every merge of a series of
commits is a completely unreasonable overhead, so even if we provide
such an option to merge-tree, I still want a separate plumbing-ish
tool that does non-worktree/non-index replaying of commits which is
not written as a driver of merge-tree.  That other tool should just
call merge_incore_nonrecursive() directly.  And such a tool, since it
should handle an arbitrary number of commits, should certainly be able
to handle just one commit.  From that angle, it feels like adding
another mode to merge-tree would just be a partial duplication of the
other tool.
I wonder how the UI of a tool that does non-worktree/non-index cherry-picks
will look like.  I'd expect it to produce the same output as merge-tree,
except cherry-pick should probably output a commit OID, not a tree.

Maybe we want a unified command that produces commits from any sequence of
merge/cherry-pick/revert/reword steps. The obvious UI would use something
like the rebase-todo list as input.  For example:

        $ echo '
        pick commit1
        reword commit2  # edit commit message in $GIT_EDITOR
        merge commit3 -m "log message"
        ' | git create-commit commit0
        <OID of final commit>

we start from commit0 and apply steps one-by-one. Obviously, one unsolved
problem is how to pass parameters like commit messages if no editor should
be invoked (my sketch uses -m).
If any of the steps fails when merging merge, then we get the tree with
conflicts

        $ echo '
        pick commit1
        pick commit2
        pick commit-that-does-not-apply
        ' | git create-commit commit0
        <OID of commit after step 2>
        <OID of toplevel tree after failed merge>
        <Conflicted file info>
        <Informational messages>

Replaying a series of commits might look like this:

        $ echo 'pick commit1 ^commit0' | git create-commit new-base

I'm concluding that this is a difficult UI problem
I agree.  I've got a lot of thoughts on it, and some work in progress
towards it (https://github.com/newren/git/tree/replay -- _very_ hacky,
not even close to alpha quality, lots of fixup commits, todo comments,
random brain dump files added to the tree, based on a previous round
of this patch series, not updated for weeks, etc., etc.)
Just chiming in that I find that very exciting. But it's a tangent, and
slightly distracting from the topic at hand, so I would like to ask to
focus back on server-side merges.
quoted
and having a merge-tree command that accepts a "common ancestor"
parameter could make it easier to experiment.  Of course that depends
on who is experimenting.
I think that would result in experiments and eventually full-blown
scripts designed around forking subprocesses for every merge, and
pushes us back into the world of having a scripted-rebase again.  Yes,
I know people can transliterate shell back to C; it seems to always be
done as a half-way measure with the forking just being done from C or
have other UI-warts guided by the shell design.  In fact, *that* was
the primary reason for me not providing a merge-tree option based on
merge_incore_nonrecursive(), despite how trivial it'd be to provide
it.  If someone wanted a merge_incore_nonrecursive() mode for
merge-tree for reasons other than attempting to build a
rebase/cherry-pick replacement based on it, then I'd be much happier
to provide it.

If someone wants to experiment with what a plumbing-ish
rebase/cherry-pick would look like, the _right_ way to do it would be
making using of merge_incore_nonrecursive() directly.  If they want
example code, I already provided some a year and a half ago and got it
merged into git.git in the form of t/helper/test-fast-rebase.c.  My
"replay" branch is based on that code, but (a) moves it from t/helper
to a real builtin, (b) removes the hardcoded very strict input, (c)
removes the line of code doing the index & working tree updates, and
(d) modifies the output to be a more plumbing-ish style.
I actually implemented that so I could provide apples-to-apples
speed comparisons between libgit2 and merge-ort:

-- snip --
From 6a865c691810b67dc15ddb57ad110bd6fdfc2f12 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <redacted>
Date: Fri, 28 Jan 2022 23:28:20 +0100
Subject: [PATCH] merge-tree: optionally force a simple 3-way merge

Signed-off-by: Johannes Schindelin <redacted>
---
 builtin/merge-tree.c | 72 ++++++++++++++++++++++++++++++--------------
 1 file changed, 50 insertions(+), 22 deletions(-)
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index 58c0ddc5a3..1007aaaede 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -396,6 +396,7 @@ struct merge_tree_options {
 	int allow_unrelated_histories;
 	int show_messages;
 	int exclude_modes_oids_stages;
+	const char *nonrecursive_base;
 };

 static int real_merge(struct merge_tree_options *o,
@@ -409,34 +410,58 @@ static int real_merge(struct merge_tree_options *o,
 	struct merge_options opt;
 	struct merge_result result = { 0 };

-	parent1 = get_merge_parent(branch1);
-	if (!parent1)
-		help_unknown_ref(branch1, "merge-tree",
-				 _("not something we can merge"));
-
-	parent2 = get_merge_parent(branch2);
-	if (!parent2)
-		help_unknown_ref(branch2, "merge-tree",
-				 _("not something we can merge"));
-
 	init_merge_options(&opt, the_repository);

 	opt.show_rename_progress = 0;

-	opt.branch1 = branch1;
-	opt.branch2 = branch2;
+	if (o->nonrecursive_base) {
+		struct object_id base_oid, head_oid, merge_oid;
+		struct tree *base_tree, *head_tree, *merge_tree;
+
+		opt.ancestor = "(base)";
+		opt.branch1 = "(branch1)";
+		opt.branch2 = "(branch2)";
+
+		if (get_oid_treeish(o->nonrecursive_base, &base_oid))
+			die("could not parse base '%s'", o->nonrecursive_base);
+		base_tree = parse_tree_indirect(&base_oid);
+		if (get_oid_treeish(branch1, &head_oid))
+			die("could not parse head '%s'", branch1);
+		head_tree = parse_tree_indirect(&head_oid);
+		if (get_oid_treeish(branch2, &merge_oid))
+			die("could not parse merge '%s'", branch2);
+		merge_tree = parse_tree_indirect(&merge_oid);
+
+		merge_incore_nonrecursive(&opt,
+					  base_tree, head_tree, merge_tree,
+					  &result);
+	} else {
+		parent1 = get_merge_parent(branch1);
+		if (!parent1)
+			help_unknown_ref(branch1, "merge-tree",
+					 _("not something we can merge"));
+
+		parent2 = get_merge_parent(branch2);
+		if (!parent2)
+			help_unknown_ref(branch2, "merge-tree",
+					 _("not something we can merge"));
+
+		opt.branch1 = branch1;
+		opt.branch2 = branch2;

-	/*
-	 * Get the merge bases, in reverse order; see comment above
-	 * merge_incore_recursive in merge-ort.h
-	 */
-	common = get_merge_bases(parent1, parent2);
-	if (!common && !o->allow_unrelated_histories)
-		die(_("refusing to merge unrelated histories"));
-	for (j = common; j; j = j->next)
-		commit_list_insert(j->item, &merge_bases);
+		/*
+		 * Get the merge bases, in reverse order; see comment above
+		 * merge_incore_recursive in merge-ort.h
+		 */
+		common = get_merge_bases(parent1, parent2);
+		if (!common && !o->allow_unrelated_histories)
+			die(_("refusing to merge unrelated histories"));
+		for (j = common; j; j = j->next)
+			commit_list_insert(j->item, &merge_bases);
+
+		merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
+	}

-	merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
 	if (result.clean < 0)
 		die(_("failure to merge"));
@@ -501,6 +526,9 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
 			   &o.allow_unrelated_histories,
 			   N_("allow merging unrelated histories"),
 			   PARSE_OPT_NONEG),
+		OPT_STRING(0, "force-non-recursive-base", &o.nonrecursive_base,
+			   N_("base-tree"),
+			   N_("force a simple three-way merge")),
 		OPT_END()
 	};

-- snap --
I do strongly agree that this should _not_ enter core Git's code, I just
provide this in case someone else wants to play with merge-ort on the
server side in an existing code base.
We'll certainly have discussions on what that should look like.  But a
plumbing-ish replacement for merge was much simpler, and made sense to
do first.  I would prefer to concentrate on getting that hammered down
first.  Then I'll start discussions on a plumbing-ish
rebase/cherry-pick.  And if that doesn't fulfill all the needs that
folks think they want out of merge-tree, then we can add a
merge_incore_nonrecursive()-based mode to merge-tree.  It's all
coming, but having fought transliterations-of-scripts in
merge-recursive.c, sequencer.c, stash.c, rebase.c, etc. for years I
really, really don't want any more of that.  Let's end that insanity.
Being the driving force behind many a "built-in-ification" of scripted
commands, I wholeheartedly agree. You can still see the fall-out of
designing commands in a scripted fashion, without any way to represent
data structures other than strings. I wish we had come up with a better
design to prototype commands than to write shell scripts. But I have to
admit that even I do not have any better idea than to work on a proper API
for libgit.a (which has historically invariably seen push-back from
Junio).

While I agree that this discussion is a valuable one, right now I would
like to focus on getting the server-side merges done, and once that has
happened, move on to the replay/sequencer/API discussion (which will
probably be a big one, not so much for technical reasons but more for all
too human ones).

Ciao,
Dscho
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help