Re: [PATCH 08/12] merge-ort: provide a merge_get_conflicted_files() helper function
From: Johannes Schindelin <hidden>
Date: 2022-02-25 16:31:33
Hi Elijah, On Tue, 22 Feb 2022, Elijah Newren wrote:
Sidenote: Do you lump in binary merge conflicts with "non-semantic merge conflicts"? You would by your definition, but I'm not sure it matches. I tend to call things either content-based conflicts or path-based conflicts, where content-based usually means textual-based but also includes merges of binaries.
I like "content-based conflicts". And no, I had not even thought about binary merge conflicts yet...
On Mon, Feb 21, 2022 at 2:46 AM Johannes Schindelin [off-list ref] wrote:quoted
Concretely: while I am not currently aware of any web UI that allows to resolve simple rename/rename conflicts, it is easily conceivable how to implement such a thing. When that happens, we will need to be able to teach the server-side code to discern between the cases that can be handled in the web UI (trivial merge conflicts, trivial rename/rename conflicts) as compared to scenarios where the conflicts are just too complex.Um, I'm really worried about attempting to make the conflict notices machine parseable. I don't like that idea at all, and I even tried to rule that out already with my wording: """ In all cases, the <Informational messages> section has the necessary info, though it is not designed to be machine parseable. """ though maybe I should have been even more explicit. The restrictions that those messages be stable is too rigid, I think. I also think they're a poor way to communicate information to a higher level tool. I would much rather us add some kind of additional return data structures from merge ort and use them if we want extra info.
Okay. I thought that we could keep the `CONFLICT (<type>)` constant enough to serve as such a machine-parseable thing. And then presenting `<path>NUL<message>NUL` could have served my use case well...
quoted
Here's an excerpt from t4301: -- snip -- Auto-merging greeting CONFLICT (content): Merge conflict in greeting Auto-merging numbers CONFLICT (file/directory): directory in the way of whatever from side1; moving it to whatever~side1 instead. CONFLICT (modify/delete): whatever~side1 deleted in side2 and modified in side1. Version side1 of whatever~side1 left in tree. -- snap -- This is the complete set of messages provided in the `test conflict notices and such` test case. I immediately notice that every line contains at least one file name. Looking at https://github.com/git/git/blob/v2.35.1/merge-ort.c#L1899, it does not seem as if the file names are quoted: path_msg(opt, path, 1, _("Auto-merging %s"), path); (where `path` is used verbatim in a call to `merge_3way()` before that, i.e. it must not have been quoted) I would like to register a wish to ensure that file names with special characters (such as most notably line-feed characters) are quoted in these messages, so that a simple server-side parser can handle messages starting with `Auto-merging` and with `CONFLICT (content): Merge conflict in `, and "throw the hands up in the air" if any other message prefix is seen. Do you think we can switch to `sq_quote_buf_pretty()` for these messages? For the `Auto-merging` one, it would be trivial, but I fear that we will have to work a bit on the `path_msg()` function (https://github.com/git/git/blob/v2.35.1/merge-ort.c#L630-L649) because it accepts a variable list of arguments without any clue whether the arguments refer to paths or not. (And I would be loathe to switch _all_ callers to do the quoting themselves.) I see 28 calls to that function, and at least a couple that pass not only a path but also an OID (e.g. https://github.com/git/git/blob/v2.35.1/merge-ort.c#L1611-L1613). We could of course be sloppy and pass even OIDs through `sq_quote_buf_pretty()` in `path_msg()`, knowing that there won't be any special characters in them, but it gets more complicated e.g. in https://github.com/git/git/blob/v2.35.1/merge-ort.c#L1648-L1651, where we pass an `strbuf` that contains a somewhat free-form commit message. I guess we could still pass those through `sq_quote_buf_pretty()`, even if they are not paths, to ensure that there are no special characters in the machine-parseable lines. What do you think?Switching to single quoting paths as a matter of style might make sense, but only if we go through and change every caller to do so so that we can make sure it applies to all paths. And only paths and not OIDs.
Yes, that sounds unappealing.
But I'm going to reserve the right in merge-ort to modify, add, or delete any of those messages passed to path_msg(), which might wreak havoc on your attempts to parse those strings. I think they're a bad form for communicating information to a script or program, and trying to transform them into such risks making them suboptimal at communicating info to humans. These messages should optimize the latter, and if we want something for the former, it should probably be a new independent bit of info.
Makes sense. So we need something in addition to those messages. Ciao, Dscho