Thread: 29 messages, 13 authors, 2022-07-29

RE: Feature request: provide a persistent IDs on a commit

From: Jason Pyeron <hidden>
Date: 2022-07-29 12:49:57

From: Stephen Finucane
Sent: Friday, July 29, 2022 8:11 AM

On Tue, 2022-07-19 at 13:09 +0200, Ævar Arnfjörð Bjarmason wrote:
quoted
On Tue, Jul 19 2022, Stephen Finucane wrote:
quoted
On Mon, 2022-07-18 at 20:50 +0200, Ævar Arnfjörð Bjarmason wrote:
quoted
On Mon, Jul 18 2022, Stephen Finucane wrote:
quoted
...to track evolution of a patch through time.

tl;dr: How hard would it be to retrofit a 'ChangeID' concept à la the
'Change-ID' trailer used by Gerrit into git core?

Firstly, apologies in advance if this is the wrong forum to post a feature
request. I help maintain the Patchwork project [1], which is a web-based tool that
provides a mechanism to track the state of patches submitted to a mailing list
and make sure stuff doesn't slip through the cracks. One of our long-term goals
has been to track the evolution of an individual patch through multiple
revisions. This is a surprisingly hard goal because oftentimes there isn't a whole
lot to work with. One can try to guess whether things are the same by inspecting
the metadata of the commit (subject, author, commit message, and the diff
itself) but each of these metadata items is subject to arbitrary changes and
is therefore fallible.

One of the mechanisms I've seen used to address this is the 'Change-ID' trailer
used by Gerrit. For anyone that hasn't seen this, the Gerrit server provides a
git commit hook that you can install locally. When installed, this appends a
'Change-ID' trailer to each and every commit message. In this way, the evolution
of a patch (or a "change", in Gerrit parlance) can be tracked through time since
the Change ID provides an authoritative answer to the question "is this still
the same patch". Unfortunately, there are still some obvious downsides to this
approach. Not only does this additional trailer clutter your commit messages but
it's also something the user must install themselves. While Gerrit can insist
that this is installed before pushing a change, this isn't an option for any of
the common forges nor is it something git-send-email supports.
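
As a rough illustration of how tooling can lean on such a trailer, the sketch
below pulls a Gerrit-style Change-Id out of raw commit messages and groups
commit hashes that share one. The function names are hypothetical, and the
trailer pattern ('I' followed by 40 hex digits) is an assumption based on the
example ID quoted in this thread; this is not Patchwork or Gerrit code.

```python
import re
from collections import defaultdict

# Assumed trailer shape: "Change-Id: I" followed by 40 lowercase hex digits.
CHANGE_ID_RE = re.compile(r"^Change-Id:\s*(I[0-9a-f]{40})\s*$", re.MULTILINE)


def extract_change_id(message):
    """Return the last Change-Id trailer found in a commit message, or None."""
    matches = CHANGE_ID_RE.findall(message)
    return matches[-1] if matches else None


def group_by_change_id(commits):
    """Group (commit_hash, message) pairs by their Change-Id.

    Commits without a trailer are collected under the key None, which is
    exactly the case where a tool has to fall back to guessing from metadata.
    """
    groups = defaultdict(list)
    for commit_hash, message in commits:
        groups[extract_change_id(message)].append(commit_hash)
    return dict(groups)
```

Two revisions of a patch whose subject, body, and diff all changed would still
land in the same group as long as the trailer survived the rewrite.
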
git format-patch+send-email will send your trailers along as-is, so how
does it not support Change-Id? Does it need some support that any other
made-up trailer doesn't?
It supports sending the trailers, sure. What it doesn't support is insisting you
send this specific trailer (Change-Id). Only Gerrit can do this (server side,
thankfully, which means you don't need to ask all contributors to install this
hook if you want to rely on it for tooling, CI, etc.).
Ah, it's still unclear to me what you're proposing here though. That
send-email always (generates?) or otherwise insists on the trailer, that
it can be configured to add it?

That send-email have some "pre-send-email" hook? Something else?
(Apologies for the delayed response: I was on holiday).

I'm afraid I don't have the correct terminology to describe what I'm suggesting
so I'll show an example instead.

I have configured the 'fuller' pretty formatter locally:

   $ git config format.pretty
   fuller

When I do git log on e.g. the openstack nova repo, I see:

   commit 2709e30956b53be1dca91eec801220f0efbaed93
   Author:     Stephen Finucane [off-list ref]
   AuthorDate: Thu Jul 14 15:43:40 2022 +0100
   Commit:     Stephen Finucane [off-list ref]
   CommitDate: Mon Jul 18 12:30:25 2022 +0100

       Fix compatibility with jsonschema 4.x

       This changed one of the error messages we depend on [1].

       [1] https://github.com/python-jsonschema/jsonschema/commit/641e9b8c

       Change-Id: I643ec568ee2eb2ec1a555f813fd2f1acff915afa
       Signed-off-by: Stephen Finucane [off-list ref]

(Side note: What *is* the term for the "Author", "AuthorDate", "Commit" and
"CommitDate" fields? Commit header? Commit metadata? Something else?)

My thinking is there are two types of information here: information that relates
to the "committing" of this change and information that relates to the
"authorship" of this change. The commit ID, 'Commit' and 'CommitDate' fields
clearly form the commit parts. I'm arguing that it would be good to have an
equivalent to the commit ID field for the authorship-type metadata.

   commit 2709e30956b53be1dca91eec801220f0efbaed93
   Author:     Stephen Finucane [off-list ref]
   AuthorDate: Thu Jul 14 15:43:40 2022 +0100
   AuthorID:   I643ec568ee2eb2ec1a555f813fd2f1acff915afa
   Commit:     Stephen Finucane [off-list ref]
   CommitDate: Mon Jul 18 12:30:25 2022 +0100

       Fix compatibility with jsonschema 4.x

       This changed one of the error messages we depend on [1].

       [1] https://github.com/python-jsonschema/jsonschema/commit/641e9b8c

       Signed-off-by: Stephen Finucane [off-list ref]

At risk of repeating myself, I think this information would be valuable to allow
me to answer the question "is this the same[*] commit?". During code review,
this would allow me to track the evolution of an individual patch. Once a patch
is merged, it would allow me to track the backporting or cherry-picking of that
We have been toying with this. We are looking at a field (behaves like parent) to track "original commit".

This value would be set on first rebase, amend, cherry-pick, etc.

The bonus for us will be when we patch Gerrit to consume it, and git log --graph --somenewoption to use it.

It would be nice if git core added such a value.

-Jason