Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
From: Elijah Newren <hidden>
Date: 2022-09-23 01:47:27
On Thu, Sep 22, 2022 at 12:50 PM Johannes Schindelin [off-list ref] wrote:
Hi Elijah, now that I have studied more of the code, hunted down Coverity reports about `merge-ort.c` and `builtin/merge-tree.c` and essentially have grown a lot more confidence about my patch, I'll take the time to respond in a bit more detail. On Wed, 21 Sep 2022, Elijah Newren wrote:quoted
On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget [off-list ref] wrote:quoted
From: Johannes Schindelin <redacted>
[...]
quoted
quoted
@@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o, if (o->show_messages == -1) o->show_messages = !result.clean; - printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination); + tree_oid = result.tree ? &result.tree->object.oid : null_oid(); + printf("%s%c", oid_to_hex(tree_oid), line_termination);Perhaps also print a warning to stderr when result.tree is NULL?I ended up not even touching `builtin/merge-tree.c` but instead ensuring that `result.clean` is negative if we fail to write an object (which could happen even in read/write repositories, think "disk full"). As a consequence, we do not even reach this `printf()` anymore, as you pointed out, a negative `result.clean` is handled much earlier.
Works for me. :-)
It is handled via a `die()` in `real_merge()`, and that will need to change, I think, if we want to continue on the batched merges.
Indeed, good point.
quoted
quoted
if (!result.clean) { struct string_list conflicted_files = STRING_LIST_INIT_NODUP; const char *last = NULL;@@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o, &result); } merge_finalize(&opt, &result); - return !result.clean; /* result.clean < 0 handled above */ + return !result.tree || !result.clean; /* result.clean < 0 handled above */Thinking out loud, should this logic be at the merge-ort.c level, perhaps something like this:@@ -4940,6 +4941,9 @@ static voidmerge_ort_nonrecursive_internal(struct merge_options *opt, result->tree = parse_tree_indirect(&working_tree_oid); /* existence of conflicted entries implies unclean */ result->clean &= strmap_empty(&opt->priv->conflicted); + if (!result->tree) + /* This shouldn't happen, because if we did fail to write a tree we should have returned early before getting here. But just in case we have some bugs... */ + result->clean = -1; if (!opt->priv->call_depth) { result->priv = opt->priv; result->_properly_initialized = RESULT_INITIALIZED; That might benefit callers other than merge-tree, though maybe it makes it harder to print a helpful error message (unless we're fine with the library always throwing one?)The error messages are already thrown about (this is how it looks like with v3 of this patch): [...] + git -C read-only merge-tree side1 side2 error: insufficient permission for adding an object to repository database ./objects error: error: Unable to add numbers to database error: insufficient permission for adding an object to repository database ./objects error: error: Unable to add greeting to database
Ah, this confirms my suspicions. These lines look like two failures, both for adding blob objects. This shows that the return value from the write_object_file() call within handle_content_merge() is not being correctly propagated upwards, and the merge is (erroneously) continuing on despite that.
error: insufficient permission for adding an object to repository database ./objects
And I'm guessing the lack of the "Unable to add %s to database" following message means this was the attempt to write a new tree.
fatal: failure to merge
And this shows that with your patch, you are propagating and catching the failure to write the tree object and aborting, so we at least catch failures to write trees now. [...]
quoted
And then, possibly post-v2.38.0 though we may be able to get it in before, getting correct propagation of a -1 return value from the source of the error would be good. Would you like to look into that, or would you prefer I did?It turned out not to be too bad. Essentially, to propagate `write_object_file()` failures I only had to change a couple of signatures (with the corresponding `return`s): `write_tree()`, `write_completed_directory()`, and `process_entries()`. And then, of course, I needed to change `merge_ort_nonrecursive_internal()` to respect the propagated error by setting `result->clean = -1`. Pretty straight-forward, really, much less involved than I had expected.
Nice!