Thread (94 messages) 94 messages, 5 authors, 2018-02-26

Re: [PATCH v4 12/13] commit-graph: read only from specific pack-indexes

From: Stefan Beller <hidden>
Date: 2018-02-21 22:25:17

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee [off-list ref] wrote:
quoted hunk ↗ jump to hunk
Teach git-commit-graph to inspect the objects only in a certain list
of pack-indexes within the given pack directory. This allows updating
the commit graph iteratively, since we add all commits stored in a
previous commit graph.

Signed-off-by: Derrick Stolee <redacted>
---
 Documentation/git-commit-graph.txt | 11 +++++++++++
 builtin/commit-graph.c             | 32 +++++++++++++++++++++++++++++---
 commit-graph.c                     | 26 ++++++++++++++++++++++++--
 commit-graph.h                     |  4 +++-
 packfile.c                         |  4 ++--
 packfile.h                         |  2 ++
 t/t5318-commit-graph.sh            | 16 ++++++++++++++++
 7 files changed, 87 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index b9b4031..93d50d1 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -42,6 +42,10 @@ With the `--delete-expired` option, delete the graph files in the pack
 directory that are not referred to by the graph-latest file. To avoid race
 conditions, do not delete the file previously referred to by the
 graph-latest file if it is updated by the `--set-latest` option.
++
+With the `--stdin-packs` option, generate the new commit graph by
+walking objects only in the specified packfiles and any commits in
+the existing graph-head.
A general question on this series:
How do commit graph buildups deal with garbage collected commits?
(my personal workflow is heavy on rebase, which generates lots of
dangling commits, to be thrown out later)

The second half of the sentence makes it sound like once a
commit is in the graph it cannot be pulled out easily again, hence
the question on the impact of graphs on a long living repository
which is garbage collected frequently.

AFAICT you could just run
    git commit-graph write --set-latest [--delete-expired]
as that actually looks up objects from outside the existing graph files,
such that lost objects are ignored?
+       const char **lines = NULL;
+       int nr_lines = 0;
+       int alloc_lines = 0;
(nit:)
I had the impression that these triplet-variables, that are used in
ALLOC_GROW are allo X, X_nr and X_allow, but I might be wrong.
quoted hunk ↗ jump to hunk
@@ -170,7 +178,25 @@ static int graph_write(int argc, const char **argv)

        old_graph_name = get_graph_latest_contents(opts.obj_dir);

-       graph_name = write_commit_graph(opts.obj_dir);
+       if (opts.stdin_packs) {
+               struct strbuf buf = STRBUF_INIT;
+               nr_lines = 0;
+               alloc_lines = 128;
alloc_lines has been initialized before, so why redo it here again?
Also what is the rationale for choosing 128 as a good default?
I would guess 0 is just as fine, because ALLOC_GROW makes sure
that it growth fast in the first couple entries by having an additional
offset. (no need to fine tune the starting allocation IMHO)
+               ALLOC_ARRAY(lines, alloc_lines);
+
+               while (strbuf_getline(&buf, stdin) != EOF) {
+                       ALLOC_GROW(lines, nr_lines + 1, alloc_lines);
+                       lines[nr_lines++] = buf.buf;
+                       strbuf_detach(&buf, NULL);
strbuf_detach returns its previous buf.buf, such that you can combine these
two lines as
    lines[nr_lines++] = strbuf_detach(&buf, NULL);

+               }
+
+               pack_indexes = lines;
+               nr_packs = nr_lines;
Technically we do not need to strbuf_release(&buf) here, because
strbuf_detach is always called, and by knowing its implementation,
it is just as good.

quoted hunk ↗ jump to hunk
@@ -579,7 +581,27 @@ char *write_commit_graph(const char *obj_dir)
                oids.alloc = 1024;
        ALLOC_ARRAY(oids.list, oids.alloc);

-       for_each_packed_object(if_packed_commit_add_to_list, &oids, 0);
+       if (pack_indexes) {
+               struct strbuf packname = STRBUF_INIT;
+               int dirlen;
+               strbuf_addf(&packname, "%s/pack/", obj_dir);
+               dirlen = packname.len;
+               for (i = 0; i < nr_packs; i++) {
+                       struct packed_git *p;
+                       strbuf_setlen(&packname, dirlen);
+                       strbuf_addstr(&packname, pack_indexes[i]);
+                       p = add_packed_git(packname.buf, packname.len, 1);
+                       if (!p)
+                               die("error adding pack %s", packname.buf);
+                       if (open_pack_index(p))
+                               die("error opening index for %s", packname.buf);
+                       for_each_object_in_pack(p, if_packed_commit_add_to_list, &oids);
+                       close_pack(p);
+               }
strbuf_release(&packname);
+       }
+       else
(micro style nit)

    } else
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help