Re: Slow git pack-refs --all
From: Martin Fick <hidden>
Date: 2026-01-06 23:03:37
From: Jeff King <redacted> Sent: Tuesday, January 6, 2026 3:38 AM
> On Mon, Jan 05, 2026 at 11:45:41PM +0000, Martin Fick wrote:
>
> > By repacking to get one used pack and one cruft pack only, and no loose objects, I have confirmed that pack-refs is still slow. This rules out the idea that the loose object or pack file counts were making things slow.
>
> OK, that is interesting. I'd still expect opening the objects to be the dominating factor, but now the load would be on jumping around the mmap'd packfile rather than open/read/close calls.

I believe I have confirmed this now with more testing... After first dropping the system caches and then catting the pack file to /dev/null, pack-refs ran in under 20s! Note that catting neither the idx nor the packed-refs file noticeably sped things up.
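
For anyone wanting to repeat the pre-warming step from a script, here is a minimal sketch of what catting a file to /dev/null does: it reads the file sequentially and throws the data away, which pulls it into the page cache. The function name `prewarm` and the chunk size are my own choices for illustration. Dropping the caches beforehand is a separate, root-only step on Linux and is only mentioned in a comment.

```python
def prewarm(path, chunk_size=1 << 20):
    """Read `path` sequentially and discard the data (like `cat path > /dev/null`).

    Returns the number of bytes read. To start from a cold cache on Linux,
    root can first run: sync; echo 3 > /proc/sys/vm/drop_caches
    """
    total = 0
    # buffering=0 avoids a second userspace copy; the kernel page cache
    # is what we are trying to populate.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

Running this on the pack file, and separately on the idx and packed-refs files, would be one way to repeat the comparison.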
> > OK, after discovering the strace -r and -T options, I have determined that the 29K writes were all very fast in themselves. However, most of the writes seem to follow each other with no other system calls in between. This explains why it looks like the writes are slow, even though they aren't.
> >
> > If I tally up the time between the previous system call and each write(), it adds up to the bulk of the time (4 minutes out of 4m15s) that it takes to pack refs. This tells me that no visible I/O or system calls are the problem, but rather that the program itself is taking a long time between writes. I very much doubt that this is heavy CPU time; rather, I am going to guess that this is hidden system time spent accessing mmap'd memory.
>
> That would be consistent with reading object data from the packfile. We'll jump around within the packfile to get that data.

Agreed, but boy is that really bad performance!
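
The tally of time before each write() versus time inside it can be scripted. The sketch below assumes strace was run with both -r and -T and without -f, so each line starts with a relative timestamp and ends with a `<duration>`. The function name and the sample line in the comment are illustrative, not taken from my actual trace.

```python
import re

# Example of the assumed line shape (strace -r -T, no -f):
#      0.009000 write(3, "abc", 3) = 3 <0.000010>
HEAD = re.compile(r"^\s*(\d+\.\d+)\s+(\w+)\(")
TAIL = re.compile(r"<(\d+\.\d+)>\s*$")

def tally_write_gaps(lines):
    """Return (number of writes, seconds inside write(), seconds before write()).

    The -r value is the time since the previous syscall started, so the
    previous call's own -T duration is subtracted to get the gap in which
    no syscall was running.
    """
    n_writes = 0
    in_write = 0.0
    before_write = 0.0
    prev_dur = 0.0
    for line in lines:
        head = HEAD.match(line)
        if not head:
            continue  # signals, exit messages, and so on
        rel = float(head.group(1))
        tail = TAIL.search(line)
        dur = float(tail.group(1)) if tail else 0.0
        if head.group(2) == "write":
            n_writes += 1
            in_write += dur
            before_write += max(rel - prev_dur, 0.0)
        prev_dur = dur
    return n_writes, in_write, before_write
```

Feeding it the lines of the strace output file gives the two totals side by side, which makes it easy to see whether the time is in the writes or between them.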
> > Could it be really slow reading the packed-refs file? I can see the packed-refs file is mmap'd before the writes start, and then munmap'd after the writes are completed. If I had to guess, that likely means that the packed-refs file is being read in small increments by the kernel via mmap, and that is what is making things very slow over NFS.
>
> The packed-refs file is mmap'd, but we'll be reading it sequentially. I guess whether or not there is good read-ahead there may depend on the NFS implementation.

Yeah, ruled out now by dropping the system caches, and then catting the packed-refs file before running git pack-refs, which did NOT help speed things up.
> > My alternative theory is that each ref is being looked up via a binary search, but I don't think git does this?
>
> Git does binary search within the packed-refs file, but it shouldn't be doing so here. The write-out phase of packing refs is a straight merge between two lists: the existing packed-refs entries and the new entries we are adding.

Agreed, and I should have ruled this out by realizing that this would likely not have been affected by the system caches in my earlier tests.
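
To make the "straight merge between two lists" concrete, here is a small sketch of a two-pointer merge over lists sorted by refname. It is only an illustration of the technique, not Git's actual code. The name `merge_refs`, the (refname, oid) tuples, and the rule that the new entry wins on a name collision are all my assumptions.

```python
def merge_refs(packed, updates):
    """Merge two lists of (refname, oid) tuples, each sorted by refname.

    Both inputs are walked once from front to back, so no binary search
    is needed. Assumption: on equal refnames the entry from `updates`
    replaces the one from `packed`.
    """
    out = []
    i = j = 0
    while i < len(packed) and j < len(updates):
        if packed[i][0] < updates[j][0]:
            out.append(packed[i])
            i += 1
        elif packed[i][0] > updates[j][0]:
            out.append(updates[j])
            j += 1
        else:
            out.append(updates[j])
            i += 1
            j += 1
    out.extend(packed[i:])
    out.extend(updates[j:])
    return out
```

Since each input is consumed sequentially, the access pattern on the packed-refs file itself stays linear.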
> I'd second Patrick's suggestion to use perf or similar to try to see where the time is going.

Noted, thanks.
> You might also try building Git with NO_MMAP. That might make the I/O costs more apparent via strace, because they'll be coming via pread().

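
As a rough illustration of why that would help, the sketch below reads the same byte ranges from a file in two ways: through an mmap'd view, where the I/O happens as page faults that strace cannot see, and through explicit pread() calls, each of which shows up as a system call. The function names are hypothetical, and this says nothing about how Git's NO_MMAP build is actually implemented.

```python
import mmap
import os

def read_via_mmap(path, offsets, length):
    """Read `length` bytes at each offset through a read-only memory map."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Each slice may trigger page faults, but no read syscalls.
            return [m[off:off + length] for off in offsets]

def read_via_pread(path, offsets, length):
    """Read `length` bytes at each offset with one pread() call per read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return [os.pread(fd, length, off) for off in offsets]
    finally:
        os.close(fd)
```

Both functions return the same data; in a trace, the difference is one visible system call per read for the pread variant versus only the initial mmap() for the other.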
Agreed, I will try to do this. I think that the jgit results hint that this might even eliminate most of the I/O costs (jgit is not using mmap in my tests). It would be nice if this were a runtime config instead of requiring a rebuild, as some use cases might be better with mmap and some without.

Thanks for all the input,

-Martin