Re: Slow git pack-refs --all

From: Patrick Steinhardt <hidden>
Date: 2026-01-07 11:43:02

On Tue, Jan 06, 2026 at 11:02:19PM +0000, Martin Fick wrote:

quoted

From: Patrick Steinhardt <redacted> Sent: Monday, January 5, 2026 11:53 PM
On Mon, Jan 05, 2026 at 11:45:41PM +0000, Martin Fick wrote:

quoted

OK, after discovering the strace -r and -T options, I have determined that
the 29K writes were all very fast in themselves. However, most of the
writes seem to follow each other with no other system calls in between.
This explains why it looks like the writes are slow, even though they aren't.

If I tally up the time between the previous system call and each write(),
it adds up to the bulk of the time (4mins out of 4m15s) that it takes to
pack refs. This tells me that no visible I/O or system calls are the problem,
but rather that the program itself is taking a long time between writes.
I very much doubt that this is heavy CPU time; my guess is that it is
hidden system time spent accessing mmapped memory.
Could it be really slow reading the packed-refs file? I can see that the
packed-refs file is mmap()ed before the writes start, and then
munmap()ed after the writes are completed. If I had to guess, that likely
means that the packed-refs file is being read in small increments by the
kernel via mmap, and that is what is making things very slow over NFS.
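
The tally described above can be reproduced with a short script. This is
only a minimal sketch: it assumes the trace was captured with both
strace -r and -T and without -f (so lines carry no pid prefix), and the
name tally_gaps is made up for illustration. With -r, each line starts
with the time since the previous system call began, so subtracting the
previous call's own duration (from -T) leaves the time spent outside
system calls before each write().

```python
import re

# One line of `strace -r -T` output is assumed to look like:
#      0.008020 write(4, "..."..., 4096) = 4096 <0.000010>
LINE_RE = re.compile(
    r"^\s*(?P<rel>\d+\.\d+)\s+"      # -r: time since the previous call began
    r"(?P<name>\w+)\("               # system call name
    r".*?<(?P<dur>\d+\.\d+)>\s*$"    # -T: time spent inside this call
)


def tally_gaps(lines, syscall="write"):
    """Return (seconds inside `syscall`, seconds between calls before it)."""
    inside = 0.0
    before = 0.0
    prev_dur = 0.0
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip signals, exit markers, unfinished calls
        rel = float(m["rel"])
        dur = float(m["dur"])
        if m["name"] == syscall:
            inside += dur
            # Gap = start-to-start delta minus the previous call's duration.
            before += max(rel - prev_dur, 0.0)
        prev_dur = dur
    return inside, before
```

Calling tally_gaps(open("trace.txt")) on a saved trace (the file name is
hypothetical) gives both totals; a small first number next to a large
second one would match the observation that the writes themselves are
fast while the time goes missing between them.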

I wouldn't be surprised if NFS was the culprit. At GitLab we found it to
be a constant source of issues, which is why we eventually sunsetted the
use of it completely. Do you use any special flags for mounting the NFS
filesystem?

I am open to alternatives to NFS. Do you know of any NFS alternatives that
provide instantaneous replication to potentially hundreds of mirrors? I
have used Gerrit and git-daemon for many years on NFS, and it generally
has performed very well for us, and it solves many real performance issues
in a way that no alternative I have found even comes close to
matching. NFS, with all its warts, it is for us (and likely will be for many) until
there is a viable enterprise-ready alternative with low (zero) replication
latency and high throughput.

Yeah, agreed, NFS can get you a long way, until you eventually start to
hit some roadblocks once you reach a certain scale. Unfortunately
though, there isn't really a ready-made alternative solution that serves
your needs, or at least none that I know of. That's why GitLab
eventually settled on Gitaly Cluster with Praefect handling replication,
and why GitHub has its Spokes architecture that does basically the same
thing.

That being said, NFS can cause many issues. In this case, I would say that
something is particularly "broken" here with git, and I believe that it
would be helpful to the git community to be aware of this fairly specific
broken case, which clearly has a lot of room for improvement (as seen
by the fact that JGit, in Java, can do essentially the same thing more
than 10x faster). I have mostly been assuming that this is a
particularly specific bad case, since git daemon is generally fast for most
users, but this might actually be something that, if improved, would greatly
improve many parts of git (not just this use case).

Chances are that if we can improve the case for NFS, other filesystems
might benefit, as well. So if this is something that we can improve I
agree that we should. It's too early to tell though, as we don't really
know what the actual root cause is just yet.

It would be nice to improve git so that it does not hold packed-refs.lock
for so long, to avoid this blocking behavior on servers. Of course, to be
fair, this likely only blocks Gerrit servers, since Gerrit uses the
packed-refs file to perform atomic updates for many things, and most other
servers use loose refs instead. It would be great if git were optimized to
avoid any unnecessary reads while the lock is held. In theory, almost all
of the data that git needs to read here (including tags for peeling) could
be read before acquiring the lock, and it would only need to double
check certain reads after it acquires the lock in case things changed.
That wouldn't make git pack-refs faster, but it would drastically
reduce the impact of any problematic I/O by not holding the lock for
almost the entire operation.
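
The idea of reading before locking and only re-checking afterwards can be
sketched as follows. This is an illustration in Python, not git's code:
the names stat_sig and rewrite_with_short_lock are hypothetical, and
using (inode, size, mtime) as the change check is an assumption that may
be unreliable on NFS with attribute caching.

```python
import os


def stat_sig(path):
    """Cheap change detector: (inode, size, mtime_ns), or None if missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def rewrite_with_short_lock(path, transform):
    """Rewrite `path` as transform(old_bytes), holding path.lock briefly."""
    lock = path + ".lock"

    # Phase 1: slow reads and computation, done WITHOUT holding the lock.
    sig = stat_sig(path)
    with open(path, "rb") as f:
        new = transform(f.read())

    # Phase 2: take the lock. O_EXCL makes creation fail with
    # FileExistsError if another writer already holds it.
    fd = os.open(lock, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "wb") as out:
            # Phase 3: double check; redo the work only if the file
            # changed between the unlocked read and taking the lock.
            if stat_sig(path) != sig:
                with open(path, "rb") as f:
                    new = transform(f.read())
            out.write(new)
            out.flush()
            os.fsync(out.fileno())
    except BaseException:
        os.unlink(lock)  # give up our own lock on failure
        raise

    # The atomic rename publishes the new file and releases the lock.
    os.replace(lock, path)
```

In the common case where nothing changed, the lock is held only for the
re-check, the write and the rename; the expensive read happens up front.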

It can probably be improved, true. I think that it's a bit of a wasted
effort, as I'd rather invest the time into improving reftables as a more
future-proof solution. But as you are well aware I'm quite biased here,
and I'd welcome any efforts to also improve the files backend. I am just
unlikely to work on it myself :)

quoted

Did you try using perf(1) to profile the process and generate a flame
graph from it? That should likely make it immediately obvious where Git
is spending all of its time.

I will pursue this. Unfortunately this might be difficult on this 
particular server.

True, on the server side this can be a bit tricky.

Patrick
