Re: [PATCH v6 4/4] last-modified: use Bloom filters when available
From: Patrick Steinhardt <hidden>
Date: 2025-08-04 06:33:14
On Fri, Aug 01, 2025 at 06:23:08PM +0200, Toon Claes wrote:
Patrick Steinhardt [off-list ref] writes:quoted
On Wed, Jul 30, 2025 at 07:55:10PM +0200, Toon Claes wrote:quoted
diff --git a/builtin/last-modified.c b/builtin/last-modified.c index e4c73464c7..19bf25f8a5 100644 --- a/builtin/last-modified.c +++ b/builtin/last-modified.c@@ -179,6 +192,27 @@ static void last_modified_diff(struct diff_queue_struct *q, } } +static int maybe_changed_path(struct last_modified *lm, struct commit *origin) +{ + struct bloom_filter *filter; + struct last_modified_entry *ent; + struct hashmap_iter iter; + + if (!lm->rev.bloom_filter_settings) + return 1; + + filter = get_bloom_filter(lm->rev.repo, origin); + if (!filter) + return 1; + + hashmap_for_each_entry(&lm->paths, &iter, ent, hashent) { + if (bloom_filter_contains(filter, &ent->key, + lm->rev.bloom_filter_settings)) + return 1; + } + return 0; +}This function is basically the same as `maybe_changed_paths()` in "blame.c", but that isn't a huge issue from my point of view. What makes me wonder though is why we have an additional check over there for whether or not the commit has a valid generation number.I've been asking me the same question. And I couldn't find a good reason (neither from the commit history, or from my reasoning). This check was in the version shared by Taylor, but because we were ignoring the return value from generation_numbers_enabled() in that version, it didn't make sense to me to do this check. That's why I removed it.
Okay. It might make sense to point this out in the commit message.
quoted
quoted
@@ -227,6 +264,9 @@ static int last_modified_init(struct last_modified *lm, struct repository *r, return argc; } + prepare_commit_graph(lm->rev.repo); + lm->rev.bloom_filter_settings = get_bloom_filter_settings(lm->rev.repo); +So this here is why we export `prepare_commit_graph()`. How about we instead expose `bloom_filters_enabled()` that mirrors what we do in `generation_numbers_enabled()` and `corrected_commit_dates_enabled()`? That would both be on a higher level and do exactly what we want to achieve.I've got another proposal, what if we let get_bloom_filter_settings() call prepare_commit_graph()? Functions like repo_find_commit_pos_in_graph() and lookup_commit_in_graph() do this too.
Yeah, I don't see any issue with that, either. Patrick