Re: [PATCH v6 09/13] merge-octopus: rewrite in C
From: Derrick Stolee <hidden>
Date: 2021-01-05 16:41:40
On 11/24/2020 6:53 AM, Alban Gruin wrote:
This rewrites `git merge-octopus' from shell to C. As for the two last conversions, this port removes calls to external processes to avoid reading and writing the index over and over again.
...
quoted hunk ↗ jump to hunk
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c new file mode 100644 index 0000000000..ca8f9f345d --- /dev/null +++ b/builtin/merge-octopus.c@@ -0,0 +1,69 @@ +/* + * Builtin "git merge-octopus" + * + * Copyright (c) 2020 Alban Gruin + * + * Based on git-merge-octopus.sh, written by Junio C Hamano. + * + * Resolve two or more trees. + */ + +#define USE_THE_INDEX_COMPATIBILITY_MACROS
Hm. It would be best if this was not added to any new code. Please squash in changes that avoid these macros.
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+ "git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+ int i, sep_seen = 0;One strategy I've been trying to do, when I remember, is to start each builtin with struct repository *repo = the_repository; then use 'repo' over 'the_repository' and other macros. This gets us closer to the state where the builtin cmd_*() methods could take a repository pointer as a parameter. (We are a ways off, but every little bit helps, right?)
+ /* + * Reject if this is not an octopus -- resolve should be used + * instead. + */ + if (commit_list_count(remotes) < 2) + return 2;
This caused me to pause, since it might be nice to have a warning message here. However, it is identical behavior to the script, including the comment. It appears that there is no Documentation/git-merge-octopus.txt, but such a doc file would want to include the meaning of exit code 2.
quoted hunk ↗ jump to hunk
diff --git a/merge-strategies.c b/merge-strategies.c index 9aa07e91b5..4d9dd55296 100644 --- a/merge-strategies.c +++ b/merge-strategies.c@@ -1,5 +1,6 @@ #include "cache.h" #include "cache-tree.h" +#include "commit-reach.h"
You had my curiosity, but now you have my attention. ;)
+int merge_strategies_octopus(struct repository *r,
+ struct commit_list *bases, const char *head_arg,
+ struct commit_list *remotes)
+{
+ int ff_merge = 1, ret = 0, references = 1;
+ struct commit **reference_commit, *head_commit;'reference_commit' might be clearer if it was plural, right?
+ struct tree *reference_tree, *head_tree;
+ struct commit_list *i;
+ struct object_id head;
+ struct strbuf sb = STRBUF_INIT;
+
+ get_oid(head_arg, &head);
+ head_commit = lookup_commit_reference(r, &head);
+ head_tree = repo_get_commit_tree(r, head_commit);
+
+ if (parse_tree(head_tree))
+ return 2;
+
+ if (repo_index_has_changes(r, head_tree, &sb)) {
+ error(_("Your local changes to the following files "
+ "would be overwritten by merge:\n %s"),
+ sb.buf);
+ strbuf_release(&sb);
+ return 2;
+ }
+
+ reference_commit = xcalloc(commit_list_count(remotes) + 1,
+ sizeof(struct commit *));
+ reference_commit[0] = head_commit;
+ reference_tree = head_tree;
+
+ for (i = remotes; i && i->item; i = i->next) {
+ struct commit *c = i->item;
+ struct object_id *oid = &c->object.oid;
+ struct tree *current_tree = repo_get_commit_tree(r, c);
+ struct commit_list *common, *j;
+ char *branch_name;
+ int k = 0, up_to_date = 0;
+
+ if (ret) {
+ /*
+ * We allow only last one to have a
+ * hand-resolvable conflicts. Last round failed
+ * and we still had a head to merge.
+ */
+ puts(_("Automated merge did not work."));
+ puts(_("Should not be doing an octopus."));
+
+ free(reference_commit);
+ return 2;
+ }
+
+ branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+ common = get_merge_bases_many(c, references, reference_commit);Here we are. You should probably use repo_get_merge_bases_many(). 'references' is not a list, but instead a count. Could it be renamed nr_references or something?
+
+ if (!common) {
+ error(_("Unable to find common commit with %s"), branch_name);
+
+ free(branch_name);
+ free_commit_list(common);
+ free(reference_commit);
+
+ return 2;hm. we are getting into magic constant territory. Perhaps this should be marked with a macro in merge-strategies.h? It could be used in the case of "only two heads" as well.
+ }
+
+ for (j = common; j && !(up_to_date || !ff_merge); j = j->next) {
+ up_to_date |= oideq(&j->item->object.oid, oid);
+
+ if (k < references)
+ ff_merge &= oideq(&j->item->object.oid, &reference_commit[k++]->object.oid);I'm confused about this line. Shouldn't we care only about reference_commit[references]? If we _do_ care about all possible reference_commit[k] values, then shouldn't this be a loop over the k values, not a single check per k (and advancing as we iterate through the results from common)? Seems we could use some test cases around criss-cross octopus merges (i.e. multiple merge bases).
+ }
+
+ if (up_to_date) {
+ printf(_("Already up to date with %s\n"), branch_name);
+
+ free(branch_name);
+ free_commit_list(common);
+ continue;
+ }
+
+ if (ff_merge) {
+ ret = octopus_fast_forward(r, branch_name, head_tree,
+ current_tree, &reference_tree);
+ references = 0;
+ } else {
+ ret = octopus_do_merge(r, branch_name, common,
+ current_tree, &reference_tree);
+ }
+
+ free(branch_name);
+ free_commit_list(common);
+
+ if (ret == -1)
+ break;
+
+ reference_commit[references++] = c;
+ }
+
+ free(reference_commit);
+ return ret;
+}This patch could use a little work, but it's a good start. Thanks, -Stolee