Re: [PATCH 5/7] commit-graph: document file format v2

From: Derrick Stolee <hidden>
Date: 2022-02-28 13:44:48

On 2/25/2022 5:31 PM, Ævar Arnfjörð Bjarmason wrote:

On Thu, Feb 24 2022, Derrick Stolee via GitGitGadget wrote:

...

   Generation Data (ID: {'G', 'D', 'A', 'T' }) (N * 4 bytes) [Optional]
     * This list of 4-byte values store corrected commit date offsets for the

@@ -103,6 +112,9 @@ CHUNK DATA:
     * Generation Data chunk is present only when commit-graph file is written
       by compatible versions of Git and in case of split commit-graph chains,
       the topmost layer also has Generation Data chunk.
+    * This chunk does not exist if the commit-graph file format version is 2,
+      because the corrected commit date offset data is stored in the Commit
+      Data chunk.
 
   Generation Data Overflow (ID: {'G', 'D', 'O', 'V' }) [Optional]
     * This list of 8-byte values stores the corrected commit date offsets

We talked a while ago now about how we do commit-graph format changes
and this is partially echoing those earlier questions[1] from 2019.

I fully understand why we're writing this amended CDAT chunk in a
different layout. Without the GDAT side-chunk to look up in, the
data is more local, that part of the file is more compact, etc.

What I don't understand is why getting those performance improvements
requires the breaking version change & the writing of the incompatible
version number.

I.e. couldn't the differently formatted CDAT chunk be written to a new
chunk name (say "2DAT") instead? Per [1] we'd pay a small fixed cost for
a possibly empty chunk (I didn't re-do those numbers), but surely the
performance improvements will be about the same for that minuscule
overhead.

CDAT is a required chunk. It is part of the v1 spec that CDAT exists
and is correct. All other Git clients will error out when reading a
"v1" graph without such a chunk, and in a way that is less helpful to
users. Instead of clearly indicating "file version is too new" it will
say "commit-graph is missing the Commit Data chunk" which is not
helpful.
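
To make the difference between the two failure modes concrete, here is a
minimal sketch of the order in which a v1-only reader might check a file.
It assumes the documented header layout (4-byte "CGPH" signature, then one
byte each for version, hash version, chunk count and base-graph count)
followed by a chunk lookup table of 12-byte entries. The function names,
the zero offsets and the exact error strings are illustrative only and are
not taken from Git's source.

```python
import struct

SIGNATURE = b"CGPH"
HEADER_LEN = 8   # signature + version + hash version + chunk count + base count
ENTRY_LEN = 12   # 4-byte chunk ID + 8-byte file offset


def read_graph_error(data, max_version=1):
    """Return an error string for a reader that knows up to max_version,
    or None if the header and required-chunk checks pass."""
    if len(data) < HEADER_LEN or data[:4] != SIGNATURE:
        return "not a commit-graph file"
    version, _hash_version, num_chunks, _num_base = struct.unpack(">BBBB", data[4:8])
    # The version check comes first, so a bumped version fails clearly here.
    if version > max_version:
        return f"commit-graph version {version} is too new"
    chunk_ids = set()
    for i in range(num_chunks):
        start = HEADER_LEN + i * ENTRY_LEN
        entry = data[start:start + ENTRY_LEN]
        if len(entry) < ENTRY_LEN:
            return "commit-graph chunk lookup table is truncated"
        chunk_id, _offset = struct.unpack(">4sQ", entry)
        chunk_ids.add(chunk_id)
    # Only reached when the version was accepted: a "v1" file without CDAT
    # ends up here with the less helpful message.
    if b"CDAT" not in chunk_ids:
        return "commit-graph is missing the Commit Data chunk"
    return None


def make_graph(version, chunk_ids):
    """Build a header plus lookup table only (hypothetical test helper).
    Offsets are left at zero because no chunk data is written."""
    header = SIGNATURE + bytes([version, 1, len(chunk_ids), 0])
    table = b"".join(struct.pack(">4sQ", cid, 0) for cid in chunk_ids)
    table += struct.pack(">4sQ", b"\0\0\0\0", 0)  # terminating table entry
    return header + table
```

Feeding it a file with version 2 yields the "too new" message, while a
version 1 file that stores its commit data under "2DAT" yields the
missing-chunk message instead.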

It will give you something you can't have here, which is optional
compatibility with older clients by writing both versions. That'll be a
~2x as large file on disk, but with the page cache & each client version
skipping to the data it needs, caching characteristics & data locality
should work out to about the same thing.

Writing both is the only way that this could work without incrementing
the graph version number, but I'd rather just update the number and
avoid wasting the effort to write that extra data.

It seems you are hyper-focused on "we don't _need_ to update the version
number" and you are willing to recommend wasteful approaches in order to
support that stance.

So: you're right. We don't _need_ to update the version number. But this
is the best choice among the options available.

Or maybe they won't. I just found it surprising when reviewing this to
not find an answer to why that approach wasn't considered.

The point is to create a new format that can be chosen when deployed
in an environment where older Git versions will not exist (such as
a Git server). The new version is not chosen by default and instead
is opt-in through the commitGraph.generationVersion config option.

Perhaps in a year or two we would consider making this the new
default, but there is no rush to do so.

Thanks,
-Stolee
