Thread (224 messages) 224 messages, 7 authors, 2018-04-06

Re: Reduce pack-objects memory footprint?

From: Jeff King <hidden>
Date: 2018-03-02 10:54:23

On Fri, Mar 02, 2018 at 05:18:45PM +0700, Duy Nguyen wrote:
On Wed, Feb 28, 2018 at 4:27 PM, Duy Nguyen [off-list ref] wrote:
quoted
linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
all apps nearly unusuable (granted the problem is partly Linux I/O
scheduler too). So I wonder if we can reduce pack-objects memory
footprint a bit.
Next low hanging fruit item:

struct revindex_entry {
        off_t offset;
        unsigned int nr;
};

We need on entry per object, so 6.5M objects * 16 bytes = 104 MB. If
we break this struct apart and store two arrays of offset and nr in
struct packed_git, we save 4 bytes per struct, 26 MB total.

It's getting low but every megabyte counts for me, and it does not
look like breaking this struct will make horrible code (we recreate
the struct at find_pack_revindex()) so I'm going to do this too unless
someone objects. There will be slight performance regression due to
cache effects, but hopefully it's ok.
Maybe you will prove me wrong, but I don't think splitting them is going
to work. The point of the revindex_entry is that we sort the (offset,nr)
tuple as a unit.

Or are you planning to sort it, and then copy the result into two
separate arrays? I think that would work, but it sounds kind of nasty
(arcane code, and extra CPU work for systems that don't care about the
26MB).

-Peff
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help