Thread (37 messages) 37 messages, 3 authors, 2025-08-06

Re: [PATCH 5/6] merge-ort: fix incorrect file handling

From: Elijah Newren <hidden>
Date: 2025-08-04 22:09:02

On Fri, Aug 1, 2025 at 1:31 AM Patrick Steinhardt [off-list ref] wrote:
On Tue, Jul 22, 2025 at 03:23:10PM +0000, Elijah Newren via GitGitGadget wrote:
quoted
From: Elijah Newren <redacted>

We have multiple bugs here -- accidental silent file deletion,
accidental silent file retention for files that should be deleted,
and incorrect number of entries left in the index.

The series merged at commit d3b88be1b450 (Merge branch
'en/merge-dir-rename-corner-case-fix', 2021-07-16) introduced testcase
12i-12k in t6423 which checked for rename-to-self cases, and fixed bugs
that merge-ort and merge-recursive had with these testcases.  At the
time, I noted that merge-ort had one bug for these cases, while
merge-recursive had two.  It turns out that merge-ort did in fact have
another bug, but the "relevant renames" optimizations were masking it.
If we modify testcase 12i from t6423 to modify the file in the commit
that renames it (but only modify it enough that it can still be detected
as a rename), then we can trigger silent deletion of the file.

Tweak testcase 12i slightly to make the file in question have more than
one line in it, but which doesn't change how it operates.
Hm, the second part of this sentence doesn't quite parse for me. Do you
mean to say that 12i is basically left intact except that you change the
contents of one line?
Yeah, sometimes my repeated editing of text leaves things not so
clear.  You are correct that I meant leaving the testcase intact other
than changing the initial contents of one file (though I changed the
contents so it had multiple lines, not just giving it a different
single line).  I'll reword it.
quoted
Make this
change to otherwise minimize the changes between this testcase and a new
one that we want to add.  Then duplicate the testcase as 12i2, changing
it so that it adds a single line to the file in question when it is
renamed, as a testcase for this bug.
Okay.
quoted
Further, commit 98a1a00d5301 (t6423: add a testcase causing a failed
assertion in process_renames, 2025-03-06), fixed an issue with
rename-to-self but added a new testcase, 12n, that only checked for
whether the merge ran to completion.  A few commits ago, we modified
this test to check for the number of entries in the index -- but noted
that the number was wrong.  And we also noted a
silently-keep-instead-of-delete bug at the same time in the new testcase
12n2.

Fix to merge-ort to prevent multiple bugs with rename-to-self cases:
  * silent deletion of file expected to be kept (t6423 testcase 12i2)
  * silent retention of file expected to be removed (t6423 testcase 12n2)
  * wrong number of extries left in the index (t6423 testcase 12n)
I think it would have been nice to also go a bit more in depth for what
the bug actually was and how it's fixed. You do add a comment, but that
only adds a single sentence of context.
Would something like this help:

...all of these issues come up because in a rename-to-self case, when
we have a rename A->B, both A and B name the same file.  The code in
process_renames() assumes A & B are different, and tries to move the
higher order stages and file contents so that they are associated just
with the new path, but the assumptions of A & B being different can
cause A to be deleted when it's not supposed to be or mark B as
resolved and kept in place when it's supposed to be deleted.  Since A
& B are already the same path in the rename-to-self case, we can
simply skip the steps in process_renames() for such files.
quoted
Signed-off-by: Elijah Newren <redacted>
---
 merge-ort.c                         | 11 +++++
 t/t6423-merge-rename-directories.sh | 69 +++++++++++++++++++++++++++--
 2 files changed, 77 insertions(+), 3 deletions(-)
diff --git a/merge-ort.c b/merge-ort.c
index 9b9d82ed10f7..feb06720c7e1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2873,6 +2873,17 @@ static int process_renames(struct merge_options *opt,
                      newinfo = new_ent->value;
              }

+             /*
+              * Directory renames can result in rename-to-self, which we
+              * want to skip so we don't mark oldpath for deletion.
+              *
+              * Note that we can avoid strcmp here because of prior
+              * diligence in apply_directory_rename_modifications() to
+              * ensure we reused existing paths from opt->priv->paths.
+              */
+             if (oldpath == newpath)
+                     continue;
Makes me wonder whether the additional brittleness is worth the saved
`strcmp()` comparison. But on the other hand we do have tests now that
would break if the memory allocation patterns ever changed, so that's
reassuring.
There's no brittleness here; one of the many optimizations in
merge-ort.c was to intern *all* pathnames in struct
merge_options_internal's "paths" member; any code that needs to
generate/compute a filename that may be part of the merge must check
if that path already exists in opt->priv->paths, and if so use that
pointer instead so that all filename comparisons can be done with
cheap pointer comparisons.  See the big comment near the top of
merge_options_internal.  Nearly all such
string-equality-via-pointer-equality checks were introduced about the
same time, and in other functions, which makes this one kind of an
outlier.  I figured whoever reviewed this patch or ran across this in
the code might get surprised by the pointer comparison, so I tried to
add a comment to address any questions.  Looks like I wasn't thorough
enough (and the first paragraph of the comment pre-dated my finding
other bugs this patch fixed, which makes it slightly confusing), so
I'll try to see if I can improve it for v2.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help