Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
From: Derrick Stolee <hidden>
Date: 2026-05-29 21:28:34
On 5/29/2026 4:07 PM, Taylor Blau wrote:
On Fri, May 29, 2026 at 01:26:33PM -0400, Derrick Stolee wrote:quoted
On 5/28/26 11:28 AM, Derrick Stolee wrote:quoted
On 5/27/26 7:18 PM, Taylor Blau wrote:quoted
Do you have any end-to-end performance data to demonstrate that these changes are effective at scale? Are we still producing packfiles with the pack-file compression and now with .bitmap files? How does this impact the performance of a clone or fetch when using a bitmap index at read time?Here's my attempt to use our existing performance tests to analyze the impact of this series. Running p5311 against the base of this topic and this topic with GIT_TEST_PACK_PATH_WALK=1, I get this output:Yikes. That's not great, but see below for what I think is going on. (As an aside, we can focus in on either lookup=true or lookup=false, since these are just controlling whether or not the bitmap lookup table is written. On a repository as small as git.git, this shouldn't make a huge difference either way. I have a separate series to make this the default and to clean up the t/perf suite accordingly, but haven't sent it to the list yet.)quoted
Test HEAD~3 HEAD ----------------------------------------------------------------- [...]I think this is either the primary reason why you're not seeing an improvement here, or at least related to it...quoted
Do you have a good feeling for why the path-walk feature doesn't make a huge change in these test scenarios?I think the problem is that we're relying on the TEST_ variable to tell pack-objects to generate a pack using --path-walk, but treat it as a fallback. I suspect that since p5311 invokes repack *without* the '--path-walk' option, we end up in this case within cmd_pack_objects(): if (path_walk < 0) { if (use_bitmap_index > 0 || !use_internal_rev_list) path_walk = 0; else if (the_repository->gitdir && the_repository->settings.pack_use_path_walk) path_walk = 1; else path_walk = git_env_bool("GIT_TEST_PACK_PATH_WALK", 0); } , where `path_walk` is *not* set, but `use_bitmap_index` is since p5311 set the 'pack.writeBitmaps' configuration (as a separate aside, this should really prefer the non-deprecated 'repack.writeBitmaps' variant).
I see. When `--use-bitmap-index` is specified, then the test variable is ignored. So my tests aren't actually measuring the intended end state.
quoted hunk ↗ jump to hunk
So in that case, we fall back to `path_walk = 0`, and don't even bother reading the `GIT_TEST_PACK_PATH_WALK` variable. If I modify the perf test like so:--- 8< --- diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh index 047efb995d6..1c9c99216e3 100755 --- a/t/perf/p5311-pack-bitmaps-fetch.sh +++ b/t/perf/p5311-pack-bitmaps-fetch.sh@@ -13,7 +13,7 @@ test_fetch_bitmaps () { test_expect_success 'create bitmapped server repo' ' git config pack.writebitmaps true && git config pack.writeBitmapLookupTable '"$1"' && - git repack -ad + git repack -ad --path-walk ' # simulate a fetch from a repository that last fetched N days ago, for --- >8 ---, then I can get significantly improved results when running without the GIT_TEST_PACK_PATH_WALK variablle (here I'm truncating the 'lookup=false' case, which performs nearly identically): Test HEAD~3 HEAD ------------------------------------------------------------------------------------ 5311.4: server (1 days) (lookup=true) 2.57(2.52+0.04) 0.03(0.02+0.00) -98.8% 5311.5: size (1 days) 153.4K 153.4K +0.0% 5311.6: client (1 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) = 5311.8: server (2 days) (lookup=true) 2.60(2.55+0.04) 0.02(0.02+0.00) -99.2% 5311.9: size (2 days) 153.4K 153.4K +0.0% 5311.10: client (2 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) = 5311.12: server (4 days) (lookup=true) 2.60(2.54+0.05) 0.03(0.03+0.00) -98.8% 5311.13: size (4 days) 209.0K 209.0K +0.0% 5311.14: client (4 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0% 5311.16: server (8 days) (lookup=true) 2.58(2.53+0.04) 0.03(0.03+0.00) -98.8% 5311.17: size (8 days) 209.0K 209.0K +0.0% 5311.18: client (8 days) (lookup=true) 0.01(0.01+0.00) 0.01(0.02+0.00) +0.0% 5311.20: server (16 days) (lookup=true) 2.58(2.52+0.05) 0.03(0.03+0.00) -98.8% 5311.21: size (16 days) 209.0K 209.0K +0.0% 5311.22: client (16 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0% 5311.24: server (32 days) (lookup=true) 2.61(2.58+0.03) 0.03(0.02+0.01) -98.9% 5311.25: size (32 days) 212.9K 212.9K +0.0% 5311.26: client (32 days) (lookup=true) 0.01(0.02+0.00) 0.02(0.02+0.00) +100.0% 5311.28: server (64 days) (lookup=true) 2.72(2.79+0.06) 0.19(0.30+0.03) -93.0% 5311.29: size (64 days) 4.5M 4.5M -0.0% 5311.30: client (64 days) (lookup=true) 0.49(0.58+0.02) 0.48(0.56+0.04) -2.0% 5311.32: server (128 days) (lookup=true) 2.90(3.21+0.09) 0.35(0.70+0.04) -87.9% 5311.33: size (128 days) 9.4M 9.5M +0.4% 5311.34: client (128 days) (lookup=true) 0.98(1.27+0.08) 0.98(1.33+0.06) +0.0% My reading here is that we get significantly smaller packs (i.e. all 'test_size' tests drop from HEAD~3 to HEAD) in the same amount of time (i.e. that all 'test_perf' tests are roughly flat).
The sizes don't shrink, and in one case increases by a small amount. I'm happy to count those cases as noise from multi-threaded delta calculations being less deterministic. The _time_ taken to compute the packfiles is what decreases, though, which is promising.
That lines up with my expectation here, which is that even though we're using bitmaps at read time, that's effectively seeding the verbatim pack reuse over a significantly smaller pack, producing a much smaller output pack as a result.
Can you double-check this reasoning with my read of the data? The size isn't changing, but the computation time is.
quoted hunk ↗ jump to hunk
As to whether we should modify the perf suite to test this, naturally I think we should. Likely that looks like modifying p5313 to re-run `test_all_with_args` with '--use-bitmap-index' after repacking with --path-walk and generating bitmaps like so (untested):--- 8< --- diff --git a/t/perf/p5313-pack-objects.sh b/t/perf/p5313-pack-objects.sh index 46a6cd32d24..663717982b1 100755 --- a/t/perf/p5313-pack-objects.sh +++ b/t/perf/p5313-pack-objects.sh@@ -22,6 +22,21 @@ test_expect_success 'create rev input' ' EOF ' +test_repack_with_args () { + args="$@" + export args + + test_perf "repack with $args" ' + git repack -adf $args + ' + + test_size "repack size with $args" ' + gitdir=$(git rev-parse --git-dir) && + pack=$(ls $gitdir/objects/pack/pack-*.pack) && + test_file_size "$pack" + ' +} +
I see that these tests are extracted from test_all_... below:
quoted hunk ↗ jump to hunk
test_all_with_args () { parameter=$1 export parameter@@ -52,23 +67,22 @@ test_all_with_args () { test_size "shallow pack size with $parameter" ' test_file_size out ' - - test_perf "repack with $parameter" ' - git repack -adf $parameter - ' - - test_size "repack size with $parameter" ' - gitdir=$(git rev-parse --git-dir) && - pack=$(ls $gitdir/objects/pack/pack-*.pack) && - test_file_size "$pack" - ' }
Because the --use-bitmap-index and --write-bitmap-index args aren't appropriate across these different commands. nit: the diff would be more obvious if test_repack_with_args was defined after test_all_with_args so the hunk of existing tests wouldn't appear in the diff.
for version in 1 2 do - test_all_with_args --name-hash-version=$version + arg="--name-hash-version=$version" && + + test_all_with_args "$arg" && + test_repack_with_args "$arg" || return 1 done test_all_with_args --path-walk +test_repack_with_args --path-walk + +# inverted order here: we want to test using reachability bitmaps on a +# pack written with --path-walk +test_repack_with_args --path-walk --write-bitmap-index +test_all_with_args --use-bitmap-index
So this allows us to test all of the different modes.
quoted hunk ↗ jump to hunk
--- >8 ---I don't have a strong opinion on whether or not we should include that in this series or elsewhere.
I'm interested to see some results of your new p5313 test to make sure that we are getting expected size changes for the repack, since the p5311 tests were more focused on the thin fetch pack (and didn't show a change in size). For that, I'd be interested to see this test be included in a patch for future reference, too. Thanks, -Stolee