Re: [PATCH v2] diff: ensure consistent diff behavior with -I<regex> across output formats
From: Junio C Hamano <hidden>
Date: 2025-08-04 04:36:05
Lidong Yan [off-list ref] writes:
>> I do not quite get why ignore_match() has to know so much about how
>> the real code in diff.c that implements -I<regex> works, compared to
>> the illustration of "here is how to do it" Peff posted, though. It
>> somehow feels too much duplicated code.
>
> I did copy some code from diffcore-pickaxe.c. I will use Peff's code
> in the next patch and try to refactor diff_flush() to make the code
> simpler. Though the reason I match the regular expression in
> ignore_match() is that I want to return early as soon as an unmatched
> change is found. And indeed, it's not worth writing the duplicated
> code for this unknown performance benefit.

In the production code, it would be truly worth doing the
optimization; we want to avoid running diff twice if we can.
But I think the refactoring of diff_flush() codepath would
involve some new mode (perhaps DIFF_FORMAT_DRYRUN or something) that
 (1) does not produce any output, like DIFF_FORMAT_NO_OUTPUT, so
     that we do not need to play with /dev/null like Peff's
     illustration.

 (2) knows that the caller is only interested in each path having
     any change worth reporting, so that it can short-circuit once a
     change is found for each path.

So, just before you decide whether to show name or name-status,
you'd do this extra diff_flush() that is run only to learn if each
path has changes (with various "ignore" criteria) in the dry-run
mode, and it can take as many short-cuts as it needs to.
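
To illustrate the short-circuit in (2), here is a rough, self-contained
sketch. It is not git's code: struct change, struct path_diff and
path_has_reportable_change() are made-up names for illustration only, and
it assumes the -I<regex> rule that a change is ignored only when all of
its lines match one of the regexes. It produces no output and stops
looking at a path as soon as one change survives the "ignore" criteria.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; none of these are git's real types. */
struct change {			/* one hunk: its added/removed lines */
	const char **lines;
	size_t nr;
};

struct path_diff {		/* all changes found for one path */
	const char *path;
	const struct change *changes;
	size_t nr;
};

/* A line is ignorable if any of the -I regexes matches it. */
static int line_is_ignorable(const char *line,
			     const regex_t *ignore, size_t nr_ignore)
{
	for (size_t i = 0; i < nr_ignore; i++)
		if (!regexec(&ignore[i], line, 0, NULL, 0))
			return 1;
	return 0;
}

/* A change is ignored only when every one of its lines is ignorable. */
static int change_is_ignored(const struct change *c,
			     const regex_t *ignore, size_t nr_ignore)
{
	for (size_t i = 0; i < c->nr; i++)
		if (!line_is_ignorable(c->lines[i], ignore, nr_ignore))
			return 0;
	return 1;
}

/*
 * Dry run for one path: emit nothing, and return 1 as soon as the
 * first change worth reporting is seen; later changes are not examined.
 */
static int path_has_reportable_change(const struct path_diff *p,
				      const regex_t *ignore, size_t nr_ignore)
{
	assert(p);
	for (size_t i = 0; i < p->nr; i++)
		if (!change_is_ignored(&p->changes[i], ignore, nr_ignore))
			return 1;
	return 0;
}
```

The caller would then show a path in name or name-status output only
when this returns non-zero for it.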
Hmm?