Thread (7 messages) 7 messages, 3 authors, 2023-05-09

Re: [PATCH v2] diff: fix interaction between the "-s" option and other options

From: Felipe Contreras <hidden>
Date: 2023-05-09 00:38:20

Possibly related (same subject, not in this thread)

Junio C Hamano wrote:
Sergey Organov noticed and reported "--patch --no-patch --raw"
behaves differently from just "--raw".  It turns out that there are
a few interesting bugs in the implementation and documentation.

 * First, the documentation for "--no-patch" was unclear that it
   could be read to mean "--no-patch" countermands an earlier
   "--patch" but not other things.  The intention of "--no-patch"
   ever since it was introduced at d09cd15d (diff: allow --no-patch
   as synonym for -s, 2013-07-16) was to serve as a synonym for
   "-s", so "--raw --patch --no-patch" should have produced no
   output, but it can be (mis)read to allow showing only "--raw"
   output.
I would say that is orthogonal.
 * Then the interaction between "-s" and other format options were
   poorly implemented.  Modern versions of Git uses one bit each to
   represent formatting options like "--patch", "--stat" in a single
   output_format word, but for historical reasons, "-s" also is
   represented as another bit in the same word.  This allows two
   interesting bugs to happen, and we have both X-<.

   (1) After setting a format bit, then setting NO_OUTPUT with "-s",
       the code to process another "--<format>" option drops the
       NO_OUTPUT bit to allow output to be shown again.  However,
       the code to handle "-s" only set NO_OUTPUT without unsetting
s/only set/only sets/
       format bits set earlier, so the earlier format bit got
       revealed upon seeing the second "--<format>" option.  This is
       the problem Sergey observed.

   (2) After setting NO_OUTPUT with "-s", code to process
s/code/the code/
       "--<format>" option can forget to unset NO_OUTPUT, leaving
       the command still silent.
It is tempting to change the meaning of "--no-patch" to mean
"disable only the patch format output" and reimplement "-s" as "not
showing anything", but it would be an end-user visible change in
behavior.
Yes, it would be a change in behavior from what no reasonable user would
expect, to what most reasonable users would expct.

These are synonyms:

 1.a. git diff --patch-with-raw
 1.b. git diff --patch --raw

And so should these:

 2.a. git diff --raw
 2.b. git diff --no-patch --raw

But who on Earth would then think these are different?

 2.b. git diff --no-patch --raw
 2.c. git diff --raw --no-patch

Your patch is *already* an end-user visible change in behavior, so why
not do the end-user visible change in behavior that reasonable users
would expect?
Let's fix the interactions of these bits to first make "-s" work as
intended.
Is it though?
The fix is conceptually very simple.

 * Whenever we set DIFF_FORMAT_FOO because we saw the "--foo"
   option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
   given), we make sure we drop DIFF_FORMAT_NO_OUTPUT.  We forgot to
   do so in some of the options and caused (2) above.

 * When processing "-s" option, we should not just set
   DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits.
   We didn't do so and retained format bits set by options
   previously seen, causing (1) above.

It is even more tempting to lose NO_OUTPUT bit and instead take
output_format word being 0 as its replacement, but that would break
the mechanism "git show" uses to default to "--patch" output, where
the distinction between telling the command to be silent with "-s"
and having no output format specified on the command line matters,
and an explicit output format given on the command line should not
be "combined" with the default "--patch" format.
That's because the logic is not correct, the default should not be 0,
the default should be a different value, for example
DIFF_FORMAT_DEFAULT, that way each tool can update DIFF_FORMAT_DEFAULT
to whatever default is desired.

Then 0 doesn't mean default, it means NO_OUTPUT, and then removing all
the formats--including DIFF_FORMAT_DEFAULT--makes it clear what the user
intends to do.
So, while we cannot lose the NO_OUTPUT bit, as a follow-up work, we
may want to replace it with OPTION_GIVEN bit, and

 * make "--patch", "--raw", etc. set DIFF_FORMAT_$format bit and
   DIFF_FORMAT_OPTION_GIVEN bit on for each format.  "--no-raw",
   etc. will set off DIFF_FORMAT_$format bit but still record the
   fact that we saw an option from the command line by setting
   DIFF_FORMAT_OPTION_GIVEN bit.

 * make "-s" (and its synonym "--no-patch") clear all other bits
   and set only the DIFF_FORMAT_OPTION_GIVEN bit on.

which I suspect would make the code much cleaner without breaking
any end-user expectations.
Why DIFF_FORMAT_OPTION_GIVEN? DIFF_FORMAT_DEFAULT as the opposite is
much more understandable.
Once that is in place, transitioning "--no-patch" to mean the
counterpart of "--patch", just like "--no-raw" only defeats an
earlier "--raw", would be quite simple at the code level.
It's not only simple, it's a no-op, as (~DIFF_FORMAT_PATCH |
DIFF_FORMAT_OPTION_GIVEN) becomes indistinguishible from
DIFF_FORMAT_NO_OUTPUT unless another optin like DIFF_FORMAT_RAW is
specified.

I'm sending a patch series that shows that to be the case.

-- 
Felipe Contreras
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help