Thread: 2 messages, 2 authors, 2025-08-05

Re: [PATCH v2] diff: ensure consistent diff behavior with -I<regex> across output formats

From: Lidong Yan <hidden>
Date: 2025-08-05 09:23:56

Junio C Hamano [off-list ref] writes:
> But I think the refactoring of diff_flush() codepath may
> involve some new mode (perhaps DIFF_FORMAT_DRYRUN or something) that
>
> (1) does not produce any output, like DIFF_FORMAT_NO_OUTPUT, so
>     that we do not need to play with /dev/null like Peff's
>     illustration.
>
> (2) knows that the caller is only interested in each path having
>     any change worth reporting, so that it can short-circuit once a
>     change is found for each path.
>
> So, just before you want to decide showing name or name-status,
> you'd do this extra diff_flush() that is run only to learn if each
> path has changes (with various "ignore" criteria) in the dry-run
> mode, and it can do as much short-cut as it needs to.

I’m proposing to add a .diff_optimize field to struct diff_options, which
would support three modes: DIFF_OPT_NONE, DIFF_OPT_DRY_RUN,
and DIFF_OPT_BUFFER. The appropriate value would be determined
before calling diff_flush(), potentially in repo_diff_setup().
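
To make the shape of the proposal concrete, here is a minimal sketch of
the field and of one possible way to pick its value. Only the three
DIFF_OPT_* names and the .diff_optimize field come from this message;
struct sketch_diff_options, its flag members, and sketch_pick_optimize()
are hypothetical stand-ins, and the selection rule (in particular when
DIFF_OPT_BUFFER is chosen) is an assumption, not something settled here.

```c
/* The three modes named in this proposal. */
enum diff_optimize_mode {
	DIFF_OPT_NONE = 0,
	DIFF_OPT_DRY_RUN,
	DIFF_OPT_BUFFER
};

/*
 * Hypothetical stand-in for struct diff_options; the real struct has
 * many more members and encodes output formats differently.
 */
struct sketch_diff_options {
	unsigned quiet : 1;
	unsigned name_only : 1;
	unsigned name_status : 1;
	unsigned wants_patch : 1;
	enum diff_optimize_mode diff_optimize;
};

/*
 * Assumed policy: formats that only need "did this path change?" get
 * the dry run; if a patch is wanted as well, buffer the diff so the
 * later pass can reuse it; otherwise use the unoptimized path.
 */
enum diff_optimize_mode sketch_pick_optimize(const struct sketch_diff_options *o)
{
	int presence_only = o->quiet || o->name_only || o->name_status;

	if (presence_only && o->wants_patch)
		return DIFF_OPT_BUFFER;
	if (presence_only)
		return DIFF_OPT_DRY_RUN;
	return DIFF_OPT_NONE;
}
```

A caller would store the result in .diff_optimize once the options are
parsed, so that diff_flush() only has to switch on the field.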

DIFF_OPT_NONE would be the code Peff provided. DIFF_OPT_DRY_RUN
would optimize for --quiet, --name-only, --name-status, etc., so that we
can return early as soon as we find a change. DIFF_OPT_BUFFER would first
emit the changes and the context around them into a buffer (so there would
be a map from file pair to change buffer); operations that run after the
buffer is built would then use the buffer instead of calling xdl_diff().
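
The early return in the dry-run mode could look roughly like the sketch
below. It is not git code: sketch_pair_has_change() is a hypothetical
helper that takes the changed lines of one file pair as plain strings
and uses POSIX regexec() in place of git's own -I matching. It relies on
the documented -I rule that a change is ignored only if all of its lines
match, so a single non-matching line is enough to report the path.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 as soon as one changed line does not
 * match the ignore regex, 0 if every changed line is ignorable.
 * *examined is set to the number of lines looked at before stopping.
 */
int sketch_pair_has_change(const char *const *changed, size_t nr,
			   const regex_t *ignore, size_t *examined)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		/* regexec() returns 0 on a match, non-zero otherwise */
		if (regexec(ignore, changed[i], 0, NULL, 0) != 0) {
			*examined = i + 1;
			return 1; /* short-circuit: one real change suffices */
		}
	}
	*examined = nr;
	return 0;
}
```

With an ignore pattern such as "^#", a pair whose second changed line is
ordinary code is reported after two lines are examined, and the rest of
the pair is never looked at.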

However, I’m concerned that DIFF_OPT_BUFFER could lead to high memory
usage in Git, and I’m not entirely sure if this trade-off is justified.

Thanks,
Lidong