Thread (8 messages), 5 authors, 2019-12-03

hashmap vs khash? Re: [PATCH] packfile.c: speed up loading lots of packfiles.

From: Eric Wong <hidden>
Date: 2019-11-28 00:42:04

Colin Stolley [off-list ref] wrote:
> When loading packfiles on start-up, we traverse the internal packfile
> list once per file to avoid reloading packfiles that have already
> been loaded. This check runs in quadratic time, so for poorly
> maintained repos with a large number of packfiles, it can be pretty
> slow.

Cool!  Thanks for looking into this, and I've been having
trouble in that department with big alternates files.

> Add a hashmap containing the packfile names as we load them so that
> the average runtime cost of checking for already-loaded packs becomes
> constant.

Btw, would you have time to do a comparison against khash?

AFAIK hashmap predates khash in git, and hashmap was optimized
for removal.  Removals don't seem to be a problem for pack
loading.

I'm interested in exploring removing hashmap entirely in
favor of khash to keep our codebase smaller and easier to learn.
khash shows up more in other projects, and ought to have better
cache-locality.

Thanks again.
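
For readers following the thread, here is a minimal sketch of the idea
under discussion: remembering pack names in a hash set so that the
"already loaded?" check no longer rescans the whole list.  It is not
the code from the patch, and it uses neither git's hashmap API nor
khash; struct name_set and the name_set_*() / count_unique_names()
helpers are hypothetical names made up for illustration.  The set
keeps its entries in one flat array (open addressing with linear
probing), which is the general layout the cache-locality remark
refers to, and it has no removal operation since pack loading does
not appear to need one.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string set; NOT git's hashmap and NOT the khash API. */
struct name_set {
	const char **slots;	/* NULL marks an empty slot */
	size_t cap;		/* number of slots, always a power of two */
	size_t len;		/* number of stored names */
};

/* FNV-1a, a simple well-known string hash. */
static uint32_t hash_name(const char *s)
{
	uint32_t h = 2166136261u;
	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* cap must be a power of two and at least 2. */
static void name_set_init(struct name_set *set, size_t cap)
{
	size_t i;
	set->slots = malloc(cap * sizeof(*set->slots));
	if (!set->slots)
		abort();
	for (i = 0; i < cap; i++)
		set->slots[i] = NULL;
	set->cap = cap;
	set->len = 0;
}

static void name_set_release(struct name_set *set)
{
	free(set->slots);
	set->slots = NULL;
	set->cap = set->len = 0;
}

static int name_set_add(struct name_set *set, const char *name);

/* Double the table and reinsert every stored name. */
static void name_set_grow(struct name_set *set)
{
	struct name_set bigger;
	size_t i;
	name_set_init(&bigger, set->cap * 2);
	for (i = 0; i < set->cap; i++)
		if (set->slots[i])
			name_set_add(&bigger, set->slots[i]);
	free(set->slots);
	*set = bigger;
}

/*
 * Returns 1 if name was newly added, 0 if it was already present.
 * The set stores the pointer only; the caller keeps the string alive.
 */
static int name_set_add(struct name_set *set, const char *name)
{
	size_t i;
	if ((set->len + 1) * 2 > set->cap)	/* keep load factor <= 1/2 */
		name_set_grow(set);
	i = hash_name(name) & (set->cap - 1);
	while (set->slots[i]) {
		if (!strcmp(set->slots[i], name))
			return 0;
		i = (i + 1) & (set->cap - 1);	/* linear probing */
	}
	set->slots[i] = name;
	set->len++;
	return 1;
}

/* Counts distinct names; each check is O(1) on average, not a list scan. */
static size_t count_unique_names(const char **names, size_t n)
{
	struct name_set seen;
	size_t i, unique = 0;
	name_set_init(&seen, 8);
	for (i = 0; i < n; i++)
		unique += name_set_add(&seen, names[i]);
	name_set_release(&seen);
	return unique;
}
```

In a loader, a return value of 0 from name_set_add() would be the
signal to skip a pack that has already been seen.  Whether a flat
table like this actually beats a chained one for pack loading is the
sort of thing the requested comparison would have to measure.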