Thread: 17 messages, 3 authors, 2022-02-22

Re: [RFC PATCH 0/2] Introduce new merge-tree-ort command

From: Elijah Newren <hidden>
Date: 2022-01-05 16:54:03

On Wed, Jan 5, 2022 at 8:33 AM Christian Couder
[off-list ref] wrote:
During the 2nd Virtual Git Contributors’ Summit last October, and even
before, the subject of performing server side merges and rebases came
up, as platforms like GitHub and GitLab would like to support many
features and data formats that libgit2 doesn't support, like for
example SHA256 hashes and partial clone.

It's hard for them to get rid of libgit2 though, because Git itself
doesn't have a good way to support server side merges and rebases,
while libgit2 has ways to perform them. Without server side merges and
rebases, those platforms would have to launch some kind of checkout,
which can be very expensive, before any merge or rebase.

The latest discussions on this topic following the 2nd Virtual
Summit[1] ended with some proposals around a `git merge-tree` on
steroids that could be a good solution to this issue.

The current `git merge-tree` command though seems to have a number of
issues, especially:

  - it's too closely tied to the old merge-recursive strategy, which
    has not been the default since v2.34.0 and is likely to be
    deprecated over time,

  - it seems to output things in its own special format, which is not
    easy to customize, and which needs special code and logic to parse
I agree we don't want those...but why would new merge-tree options
have to use the old merge strategy or the old output format?
To move forward on this, this small RFC patch series introduces a new
`git merge-tree-ort` command with the following design:
Slightly dislike the command name.  `ort` was meant as a temporary,
internal name.  I don't think it's very meaningful to users, so I was
hoping to just make `recursive` mean `ort` after we had enough
testing, and to delete merge-recursive.[ch] at that time.  Then `ort`
merely becomes a historical footnote (...and perhaps part of the name
of the file where the `recursive` algorithm is implemented).
  - it uses merge-ort's API as is to perform the merge

  - it gets back a tree oid and a cleanliness status from merge-ort's
    API and prints them out first
Good so far.
  - it uses diff's API as is to output changed paths and code

  - the diff API, actually diff_tree_oid(), is called 3 times: once for
    the diff versus branch1 ("ours"), once for the diff versus branch2
    ("theirs"), and once for the diff versus the base.
Why?  That seems to be a performance penalty for anyone that doesn't
want/need the diffs, and since we return a tree, a caller can go and
get whatever diffs they like.
Therefore:

  - its code is very simple and very easy to extend and customize, for
    example by passing diff or merge-ort options that the code would
    just pass on to the merge-ort and diff APIs respectively

  - its output can easily be parsed using simple code
These points are good.
    and existing diff parsers

This of course means that merge-tree-ort's output is not backward
compatible with merge-tree's output, but it doesn't seem that there is
much value in keeping the same output anyway. On the contrary,
merge-tree's output is likely to be holding us back already.

The first patch in the series adds the new command without any tests
or documentation.

The second patch in the series adds a few tests that let us see what
the command's output looks like in several very simple cases.

Of course if this approach is considered valuable, I plan to add some
documentation, more tests and very likely a number of options before
submitting the next iteration.
Was there something you didn't like about
https://lore.kernel.org/git/pull.1114.git.git.1640927044.gitgitgadget@gmail.com/?
I am not sure that it's worth showing the 3 diffs (versus branch1,
branch2 and base) by default. Maybe by default no diff at all should
be shown and the command should have --branch1 (or --ours), --branch2
(or --theirs) and --base options to ask for such output, but for an
RFC patch I thought it would be better to output the 3 diffs so that
people get a better idea of the approach this patch series is taking.
I think not showing them at all, neither by default nor via an option, would be better.
All three of these are things users could easily generate for
themselves with the tree we return.  I'm curious, though, what's the
use case for wanting these specific diffs?
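
As a rough sketch of that point, and not anything taken from the patch
series: assuming the caller already has the merged tree OID in hand, and
assuming "base" means the merge base of the two branches, the three diffs
could be regenerated with plain `git diff-tree`. The helper name
`show_merge_diffs` is hypothetical.

```shell
#!/bin/sh
# Sketch only: regenerate the three diffs from an already-merged tree.
# show_merge_diffs is a hypothetical helper, not part of Git.
# Usage: show_merge_diffs <merged-tree> <branch1> <branch2>
show_merge_diffs () {
	tree=$1
	branch1=$2
	branch2=$3

	# Assumption: "base" is the merge base of the two branches.
	base=$(git merge-base "$branch1" "$branch2") || return 1

	echo "== versus branch1 (ours) =="
	git diff-tree -p "$branch1" "$tree"

	echo "== versus branch2 (theirs) =="
	git diff-tree -p "$branch2" "$tree"

	echo "== versus base =="
	git diff-tree -p "$base" "$tree"
}
```

Something like `show_merge_diffs "$tree" branch1 branch2` would print the
three patches in turn, while a caller that wants none of them simply never
runs it and pays no cost.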

Two things you didn't return that users cannot get any other way: (1)
conflict and warning messages, (2) list of conflicted paths.
[1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.2110211147490.56@tvgsbejvaqbjf.bet/


Christian Couder (2):
  merge-ort: add new merge-tree-ort command
  merge-ort: add t/t4310-merge-tree-ort.sh

 .gitignore                |   1 +
 Makefile                  |   1 +
 builtin.h                 |   1 +
 builtin/merge-tree-ort.c  |  93 ++++++++++++++++++++++
 git.c                     |   1 +
 t/t4310-merge-tree-ort.sh | 162 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 259 insertions(+)
 create mode 100644 builtin/merge-tree-ort.c
 create mode 100755 t/t4310-merge-tree-ort.sh

--
2.34.1.433.g7bc349372a.dirty