Thread (25 messages) 25 messages, 5 authors, 2026-01-06

Re: [PATCH v2 1/2] builtin/gc: fix condition for whether to write commit graphs

From: Karthik Nayak <hidden>
Date: 2026-01-06 16:33:36

Patrick Steinhardt [off-list ref] writes:
On Tue, Jan 06, 2026 at 03:27:29AM -0800, Karthik Nayak wrote:
quoted
Patrick Steinhardt [off-list ref] writes:
quoted
When performing auto-maintenance we check whether commit graphs need to
be generated by counting the number of commits that are reachable by any
reference, but not covered by a commit graph. This search is performed
by iterating through all references and then doing a depth-first search
until we have found enough commits that are not present in the commit
graph.

This logic has a memory leak though:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
      #0 0x55555562e433 in malloc (git+0xda433)
      #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8
      #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9
      #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35
      #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4
      #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12
      #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9
      #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9
      #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11
      #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8
      #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9
      #11 0x55555575166a in run_builtin ../git.c:506:11
      #12 0x5555557502f0 in handle_builtin ../git.c:779:9
      #13 0x555555751127 in run_argv ../git.c:862:4
      #14 0x55555575007b in cmd_main ../git.c:984:19
      #15 0x5555557523aa in main ../common-main.c:9:11
      #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #18 0x5555555f0934 in _start (git+0x9c934)

The root cause of this memory leak is our use of `commit_list_append()`.
This function expects as parameters the item to append and the _tail_ of
the list to append. This tail will then be overwritten with the new tail
of the list so that it can be used in subsequent calls. But we call it
with `commit_list_append(parent->item, &stack)`, so we end up losing
everything but the new item.

This issue only surfaces when counting merge commits. Next to being a
memory leak, it also shows that we're in fact miscounting as we only
respect children of the last parent. All previous parents are discarded,
so their children will be disregarded unless they are hit via another
reference.
Yikes. So we never go down the path of the first N-1 parents? Does that
inversely mean, commit-graph generation would be slower now in
repositories with lots of merges, since it is fixed to follow all paths
correctly?
Yeah, this was quite broken indeed. I don't think that the fixed walk
should result in a significant slowdown:

  - We stop walking the parent chain whenever we see a commit that is
    covered by the commit-graph.
Yup that holds true.
  - And our walk of commits that are not covered by the commit-graph is
    bounded by "maintenance.commit-graph.auto", which defaults to 100
    commits.
Ah, I didn't know this. Okay so there is a sensible cap here.
So in practice, the cost should be negligible.

There's going to be some exceptions thoulgh. Most importantly, the
complexity of the computation scales directly with the number of refs as
we use `refs_for_each_ref()`. So if you have a gazillion refs I'd expect
the performance impact to become noticeable. But that's already been the
case before this commit.

We may want to revisit this in the future if we ever notice that this
does become an issue.
Fair enough. Either ways, your fixes were _required_. So glad that we
got it fixed.

[snip]

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help