Re: [PATCH 00/11] Reduce pack-objects memory footprint
From: Jeff King <hidden>
Date: 2018-03-02 10:57:24
On Fri, Mar 02, 2018 at 07:14:01AM +0700, Duy Nguyen wrote:
> > We have a big repo, and this gets repacked on 6-8GB of memory on dev KVMs, so we're under a fair bit of memory pressure. git-gc slows things down a lot. It would be really nice to have something that made it use drastically less memory at the cost of less efficient packs. Is the property that
>
> Ahh.. less efficient. You may be more interested in [1] then. It avoids rewriting the base pack. Without the base pack, bookkeeping becomes much much cheaper. We still read every single byte in all packs though (I think, unless you use pack-bitmap) and this amount of I/O affects the rest of the system too. Perhaps reducing core.packedgitwindowsize might make it friendlier to the OS, I don't know.
Yes, the ".keep" thing is actually quite expensive. We still do a complete rev-list to find all the objects we want, and then for each object ask "is this in a pack with .keep?". Worse, the mru doesn't help there: even if we find the object in the first pack, we have to keep looking to see if it's also in _another_ pack.

There are probably some low-hanging optimizations there (e.g., only looking in the .keep packs if that's all we're looking for; we may even do that already). But I think fundamentally you'd do much better to generate the partial list of objects outside of pack-objects entirely, and then just feed it to pack-objects without using "--revs".

-Peff
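
To make the "generate the list outside of pack-objects" idea concrete, here is a rough sketch, not a tested recipe. It assumes the caller already has the set of object ids living in the .keep packs (how you build that set is up to you). The function names, the kept_oids parameter, and the base_name argument are all made up for illustration. It also holds the whole object list in memory, which a real low-memory setup would want to stream instead.

```python
import subprocess


def objects_to_pack(rev_list_lines, kept_oids):
    """Yield lines of `git rev-list --objects` output whose object id
    is not in kept_oids (a hypothetical, caller-supplied set of ids
    that already live in .keep packs).

    The full "oid path" line is kept so that pack-objects still sees
    the path as a name hint.
    """
    for line in rev_list_lines:
        line = line.strip()
        if not line:
            continue
        oid = line.split(" ", 1)[0]
        if oid not in kept_oids:
            yield line


def repack_without_revs(repo, kept_oids, base_name, revs=("--all",)):
    """Run rev-list ourselves, filter the result, and feed the object
    list to pack-objects on stdin (so no "--revs" is involved).

    Returns whatever pack-objects prints on stdout (the pack name).
    """
    rev_list = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", *revs],
        check=True, capture_output=True, text=True,
    )
    wanted = objects_to_pack(rev_list.stdout.splitlines(), kept_oids)
    # One object per line; an empty list yields empty input, not a blank line.
    stdin_data = "".join(entry + "\n" for entry in wanted)
    result = subprocess.run(
        ["git", "-C", repo, "pack-objects", base_name],
        input=stdin_data, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

The filtering is a plain set lookup per object, so the "is this in a pack with .keep?" question is answered once up front rather than by probing every pack for every object.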