Thread (218 messages) 218 messages, 10 authors, 2022-06-18

Re: [PATCH 08/12] merge-ort: provide a merge_get_conflicted_files() helper function

From: Johannes Schindelin <hidden>
Date: 2022-02-25 16:31:33

Hi Elijah,

On Tue, 22 Feb 2022, Elijah Newren wrote:
Sidenote: Do you lump in binary merge conflicts with "non-semantic
merge conflicts"?  You would by your definition, but I'm not sure it
matches.

I tend to call things either content-based conflicts or path-based
conflicts, where content-based usually means textual-based but also
includes merges of binaries.
I like "content-based conflicts".

And no, I had not even thought about binary merge conflicts yet...
On Mon, Feb 21, 2022 at 2:46 AM Johannes Schindelin
[off-list ref] wrote:
quoted
Concretely: while I am not currently aware of any web UI that allows
to resolve simple rename/rename conflicts, it is easily conceivable
how to implement such a thing. When that happens, we will need to be
able to teach the server-side code to discern between the cases that
can be handled in the web UI (trivial merge conflicts, trivial
rename/rename conflicts) as compared to scenarios where the conflicts
are just too complex.
Um, I'm really worried about attempting to make the conflict notices
machine parseable.  I don't like that idea at all, and I even tried to
rule that out already with my wording:
"""
In all cases, the
<Informational messages> section has the necessary info, though it is
not designed to be machine parseable.
"""
though maybe I should have been even more explicit.  The restrictions
that those messages be stable is too rigid, I think.  I also think
they're a poor way to communicate information to a higher level tool.
I would much rather us add some kind of additional return data
structures from merge ort and use them if we want extra info.
Okay.

I thought that we could keep the `CONFLICT (<type>)` constant enough to
serve as such a machine-parseable thing. And then presenting
`<path>NUL<message>NUL` could have served my use case well...
quoted
Here's an excerpt from t4301:

-- snip --
Auto-merging greeting
CONFLICT (content): Merge conflict in greeting
Auto-merging numbers
CONFLICT (file/directory): directory in the way of whatever from side1; moving it to whatever~side1 instead.
CONFLICT (modify/delete): whatever~side1 deleted in side2 and modified in side1.  Version side1 of whatever~side1 left in tree.
-- snap --

This is the complete set of messages provided in the `test conflict
notices and such` test case.

I immediately notice that every line contains at least one file name.
Looking at https://github.com/git/git/blob/v2.35.1/merge-ort.c#L1899, it
does not seem as if the file names are quoted:

                path_msg(opt, path, 1, _("Auto-merging %s"), path);

(where `path` is used verbatim in a call to `merge_3way()` before that,
i.e. it must not have been quoted)

I would like to register a wish to ensure that file names with special
characters (such as most notably line-feed characters) are quoted in these
messages, so that a simple server-side parser can handle messages starting
with `Auto-merging` and with `CONFLICT (content): Merge conflict in `, and
"throw the hands up in the air" if any other message prefix is seen.

Do you think we can switch to `sq_quote_buf_pretty()` for these messages?
For the `Auto-merging` one, it would be trivial, but I fear that we will
have to work a bit on the `path_msg()` function
(https://github.com/git/git/blob/v2.35.1/merge-ort.c#L630-L649) because it
accepts a variable list of arguments without any clue whether the
arguments refer to paths or not. (And I would be loathe to switch _all_
callers to do the quoting themselves.)

I see 28 calls to that function, and at least a couple that pass not only
a path but also an OID (e.g.
https://github.com/git/git/blob/v2.35.1/merge-ort.c#L1611-L1613).

We could of course be sloppy and pass even OIDs through
`sq_quote_buf_pretty()` in `path_msg()`, knowing that there won't be any
special characters in them, but it gets more complicated e.g. in
https://github.com/git/git/blob/v2.35.1/merge-ort.c#L1648-L1651, where we
pass an `strbuf` that contains a somewhat free-form commit message.

I guess we could still pass those through `sq_quote_buf_pretty()`, even if
they are not paths, to ensure that there are no special characters in the
machine-parseable lines.

What do you think?
Switching to single quoting paths as a matter of style might make
sense, but only if we go through and change every caller to do so so
that we can make sure it applies to all paths.  And only paths and not
OIDs.
Yes, that sounds unappealing.
But I'm going to reserve the right in merge-ort to modify, add, or
delete any of those messages passed to path_msg(), which might wreak
havoc on your attempts to parse those strings.  I think they're a bad
form for communicating information to a script or program, and trying
to transform them into such risks making them suboptimal at
communicating info to humans.  These messages should optimize the
latter, and if we want something for the former, it should probably be
a new independent bit of info.
Makes sense.

So we need something in addition to those messages.

Ciao,
Dscho
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help