Re: [PATCH v2 2/2] builtin/grep.c: integrate with sparse index
From: Derrick Stolee <hidden>
Date: 2022-08-30 13:48:05
On 8/29/2022 7:28 PM, Shaoxuan Yuan wrote:
Turn on sparse index and remove ensure_full_index(). Change it to only expands the index when using --sparse.
s/expands/expand/ These two sentences should be combined, anyway. Enable the sparse index for 'git grep', and only call ensure_full_index() when the --sparse argument is provided.
The p2000 tests demonstrate a ~99.4% execution time reduction for `git grep` using a sparse index. Test Before After ----------------------------------------------------------------------------- git grep --cached bogus (full-v3) 0.019 0.018 (-5.2%) git grep --cached bogus (full-v4) 0.017 0.016 (-5.8%) git grep --cached bogus (sparse-v3) 0.29 0.0015 (-99.4%) git grep --cached bogus (sparse-v4) 0.30 0.0018 (-99.4%)
Last time I asked that you don't present this to look like a performance test to make it clear that it is not the end-to-end process time. You removed the test numbers, but it still looks like end-to-end process time, then elaborate after the table. Instead, you can prepare the reader before the table using something like this: The p2000 tests do not demonstrate a significant improvement, because the index read is a small portion of the full process time, compared to the blob parsing. The times below reflect the time spent in the "do_read_index" trace region as shown using GIT_TRACE2_PERF=1.
- /* TODO: audit for interaction with sparse-index. */ - ensure_full_index(repo->index); + if (grep_sparse) + ensure_full_index(repo->index); +
As we've discussed, there are ways to remove even this call, but that shouldn't hold up this series which is already an improvement. Thanks, -Stolee