Re: [RFC/PATCH] add-patch: handle splitting hunks with diff.suppressBlankEmpty
From: Phillip Wood <hidden>
Date: 2024-07-10 13:46:32
Subsystem:
the rest · Maintainer:
Linus Torvalds
Possibly related (same subject, not in this thread)
- 2024-07-11 · Re: [RFC/PATCH] add-patch: handle splitting hunks with diff.suppressBlankEmpty · Jeff King <hidden>
- 2024-07-10 · Re: [RFC/PATCH] add-patch: handle splitting hunks with diff.suppressBlankEmpty · Junio C Hamano <hidden>
Hi Peff On 10/07/2024 10:36, Jeff King wrote:
Subject: add-patch: handle splitting hunks with diff.suppressBlankEmpty
When "add -p" parses diffs, it looks for context lines starting with a single space. But when diff.suppressBlankEmpty is in effect, an empty context line will omit the space, giving us a true empty line. This confuses the parser, which is unable to split based on such a line.
It's tempting to say that we should just make sure that we generate a diff without that option. But we may parse diffs not only generated by Git, but ones that users have manually edited. And POSIX calls the decision of whether to print the space here "implementation-defined".
Do we ever parse an edited hunk with this code? I'm not sure there is a way of splitting a hunk after it has been edited as I don't think we ever display it again.
So let's handle both cases: a context line either starts with a space or
consists of a totally empty line.
Reported-by: Ilya Tumaykin <redacted>
Signed-off-by: Jeff King <redacted>
---
I'm a little worried that this creates ambiguities, since I don't think
we are careful about following the hunk header's line counts. Imagine
you had an input like this:
@@ -1,2 +1,2 @@>
-old
+new
stuff
some garbage
We obviously know that "some garbage" is not a context line and is just
trailing junk, because it does not begin with "-", "+" or space. But
what about the blank line in between? It looks like an empty context
line, but we can only know that it is not by respecting the counts in
the hunk header.
I don't think we'd ever generate this ourselves, but could somebody
manually edit a hunk into this shape? When I tried it in practice, it
looks like we fail to apply the result even before my patch, though. I'm> not sure why that is. If I put "some garbage" without the blank line, we
correctly realize it should be discarded. It's possible I'm just holding
it wrong.When we recount the hunk after it has been edited we ignore lines that don't begin with '+', '-', ' ', or '\n' so if you add some garbage at the end of the hunk the recounted hunk header excludes it when it gets applied. I think your patch looks good. I did wonder if we wanted to fix this by normalizing context lines instead as shown in the diff below. That might make it less likely to miss adding "|| '\n'" in future code that is looking for a context line but I don't have a strong preference either way. Best Wishes Phillip ---- >8 ----
diff --git a/add-patch.c b/add-patch.c
index d8ea05ff108..795aa772b7a 100644
--- a/add-patch.c
+++ b/add-patch.c@@ -400,6 +400,12 @@ static void complete_file(char marker, struct hunk *hunk) hunk->splittable_into++; } +/* Empty context lines may omit the leading ' ' */ +static int normalize_marker(char marker) +{ + return marker == '\n' ? ' ' : marker; +} + static int parse_diff(struct add_p_state *s, const struct pathspec *ps) { struct strvec args = STRVEC_INIT;
@@ -485,6 +491,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) while (p != pend) { char *eol = memchr(p, '\n', pend - p); const char *deleted = NULL, *mode_change = NULL; + char ch = normalize_marker(*p); if (!eol) eol = pend;
@@ -532,7 +539,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) * Start counting into how many hunks this one can be * split */ - marker = *p; + marker = ch; } else if (hunk == &file_diff->head && starts_with(p, "new file")) { file_diff->added = 1;
@@ -586,10 +593,10 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) (int)(eol - (plain->buf + file_diff->head.start)), plain->buf + file_diff->head.start); - if ((marker == '-' || marker == '+') && *p == ' ') + if ((marker == '-' || marker == '+') && ch == ' ') hunk->splittable_into++; - if (marker && *p != '\\') - marker = *p; + if (marker && ch != '\\') + marker = ch; p = eol == pend ? pend : eol + 1; hunk->end = p - plain->buf;
@@ -953,7 +960,7 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff, context_line_count = 0; while (splittable_into > 1) { - ch = s->plain.buf[current]; + ch = normalize_marker(s->plain.buf[current]); if (!ch) BUG("buffer overrun while splitting hunks");
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 5d78868ac16..385c246baf0 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh@@ -1164,4 +1164,31 @@ test_expect_success 'reset -p with unmerged files' ' test_must_be_empty staged ' +test_expect_success 'hunk splitting works with diff.suppressBlankEmpty' ' + cat >expect <<-\EOF && + diff --git a/file b/file + index 777b174..ebc9684 100755 + --- a/file + +++ b/file + @@ -2,6 +2,6 @@ p + q + r + + -d + -e + -f + +s + +t + +u + EOF + + test_config diff.suppressBlankEmpty true && + test_write_lines a b c "" d e f >file && + git add file && + test_write_lines p q r "" s t u >file && + test_write_lines s y n q | git add -p && + git diff >actual && + diff_cmp expect actual +' + test_done