Thread: 30 messages, 5 authors, 2017-05-09

Re: [PATCH 0/2] Make diff plumbing commands respect the indentHeuristic.

From: Ævar Arnfjörð Bjarmason <hidden>
Date: 2017-05-09 07:59:00

On Tue, May 9, 2017 at 5:16 AM, Jeff King [off-list ref] wrote:
> On Mon, May 01, 2017 at 12:34:38PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>> I don't know if we would want to be extra paranoid about patch-ids.
>>> There is no helping:
>>>
>>>   git rev-list HEAD | git diff-tree --stdin -p | git patch-id --stable
>>>
>>> because diff-tree doesn't know that it's trying for "--stable" output.
>>> But the diffs we compute internally for patch-id could disable the
>>> heuristics. I'm not sure if those matter, though. AFAIK those are used
>>> only for internal comparisons within a single program. I.e., we never
>>> compare them against input from the user, nor do we output them to the
>>> user. So they'll change, but I don't think anybody would care.
>> I have a few-million row table with commit_id as one column & patch_id
>> as another. I.e. a commit -> patch_id mapping.
> Thanks for this data point. It's always interesting to hear about
> unforeseen uses of the tools.
>
> Out of curiosity, how do you generate the patch-ids? Is it with
> something like diff-tree piped to patch-id?
This:

    my $cmd = qq[git --git-dir="$repository_path" log --since="$since" --until="$until" --all --pretty=format:%H --binary | git patch-id];
    open my $patch_id_fh, " $cmd |";

Which is part of a loop that generates since/until for continuous
pull/insertion. Also, a few lines later there's a workaround for the
git.git bug of patch-id being ^0+$ (fixed in 2485eab55c
("git-patch-id: do not trip over "no newline" markers", 2011-02-17)),
which gives you a sense of how long it's been since anyone's touched
this.
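
To make the shape of that data concrete, here is a minimal sketch of the post-processing step, not the actual script. It relies only on `git patch-id` printing one "<patch-id> <commit-id>" pair per line. The function name `patch_id_rows` and the tab-separated output are assumptions made for illustration; the all-zero filter mirrors the workaround mentioned above.

```shell
# Hypothetical helper (not from the original script): read `git patch-id`
# output on stdin, one "<patch-id> <commit-id>" pair per line, and print
# "commit_id<TAB>patch_id" rows ready for insertion into a table.
patch_id_rows() {
    # Skip malformed lines and rows whose patch-id is all zeros (^0+$),
    # which old git versions could emit before 2485eab55c.
    awk 'NF == 2 && $1 !~ /^0+$/ { print $2 "\t" $1 }'
}
```

It would sit at the end of the pipeline, e.g. `git log --all --pretty=format:%H --binary | git patch-id | patch_id_rows`.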
> I do feel a bit sad about breaking this case (or at the very least
> forcing you to set an option to retain cross-version compatibility). But
> my gut says that we don't want to lock ourselves into never changing the
> diff algorithm (and I'm sure we've done it inadvertently a few times
> over the years; even the recent switch to turning on renames would have
> had that impact).
As noted I think it's completely fine to change the patch-ids by
changing the diff algorithm.

I'm about to give some more detail on this in the other thread, but I
find that on our repos the indent heuristic changes the patch-id for
around 2% of patches, which seems fairly typical for non-changelog-y
code. You *then* need to be using topic branches you didn't delete as
well as having authored such a patch for this change to kick in, so
the impact is really minimal.

Even if it somehow changed 100% of the ids that would be fine too. It
would auto-heal as the same git version started reading & inserting
the ids, which are only relevant in a moving window.