Re: [PATCH 04/12] bloom: clear each bloom_key after use
From: SZEDER Gábor <hidden>
Date: 2021-04-11 07:26:57
On Fri, Apr 09, 2021 at 06:47:23PM +0000, Andrzej Hunt via GitGitGadget wrote:
From: Andrzej Hunt <redacted> fill_bloom_key() allocates memory into bloom_key, we need to clean that up once the key is no longer needed. This fixes the following leak which was found while running t0002-t0099. Although this leak is happening in code being called from a test-helper, the same code is also used in various locations around git, and could presumably happen during normal usage too.
It does indeed happen: 'git commit-graph write --reachable
--changed-paths' generates Bloom filters for every commit, with each
filter containing all paths modified by its associated commit, so it
leaks a lot of 7 * 4byte hashes. This patch reduces the memory usage
of that command:
Max RSS
before after
---------------------------------------------
android-base 1275028k 1006576k -21.1%
chromium 3245144k 3127764k -3.6%
cmssw 793996k 699156k -12.0%
cpython 371584k 343480k -7.6%
elasticsearch 748104k 637936k -14.7%
freebsd-src 819020k 741272k -9.5%
gcc 867412k 730332k -15.8%
gecko-dev 2619112k 2457280k -6.2%
git 252684k 216900k -14.2%
glibc 239000k 222228k -7.0%
go 264132k 251344k -4.9%
homebrew-cask 542188k 480588k -11.4%
homebrew-core 805332k 715848k -11.1%
jdk 417832k 342928k -17.9%
libreoff-core 1257296k 1089980k -13.3%
linux 2033296k 1759712k -13.5%
llvm-project 1067216k 956704k -10.4%
mariadb-srv 695172k 559508k -19.5%
postgres 340132k 317416k -6.7%
rails 325432k 294332k -9.6%
rust 655244k 584904k -10.7%
tensorflow 507308k 480848k -5.2%
webkit 2466812k 2237332k -9.3%
Just out of curiosity, I disabled the questionable hardcoded 512 paths
limit on the size of modified path Bloom filters, and the memory usage
in the jdk repository sunk by over 55%, from 849520k to 379760k.
Please feel free to include any of the above data points in the commit
message.
quoted hunk ↗ jump to hunk
Direct leak of 308 byte(s) in 11 object(s) allocated from: #0 0x49a5e2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x6f4032 in xcalloc wrapper.c:140:8 #2 0x4f2905 in fill_bloom_key bloom.c:137:28 #3 0x4f34c1 in get_or_compute_bloom_filter bloom.c:284:4 #4 0x4cb484 in get_bloom_filter_for_commit t/helper/test-bloom.c:43:11 #5 0x4cb072 in cmd__bloom t/helper/test-bloom.c:97:3 #6 0x4ca7ef in cmd_main t/helper/test-tool.c:121:11 #7 0x4caace in main common-main.c:52:11 #8 0x7f798af95349 in __libc_start_main (/lib64/libc.so.6+0x24349) SUMMARY: AddressSanitizer: 308 byte(s) leaked in 11 allocation(s). Signed-off-by: Andrzej Hunt <redacted> --- bloom.c | 1 + 1 file changed, 1 insertion(+)diff --git a/bloom.c b/bloom.c index 52b87474c6eb..5e297038bb1f 100644 --- a/bloom.c +++ b/bloom.c@@ -283,6 +283,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, struct bloom_key key; fill_bloom_key(e->path, strlen(e->path), &key, settings); add_key_to_filter(&key, filter, settings); + clear_bloom_key(&key); } cleanup:-- gitgitgadget