Thread (79 messages) 79 messages, 10 authors, 2025-08-05

Re: [PATCH v6 4/4] last-modified: use Bloom filters when available

From: Patrick Steinhardt <hidden>
Date: 2025-08-04 06:33:14

On Fri, Aug 01, 2025 at 06:23:08PM +0200, Toon Claes wrote:
Patrick Steinhardt [off-list ref] writes:
quoted
On Wed, Jul 30, 2025 at 07:55:10PM +0200, Toon Claes wrote:
quoted
diff --git a/builtin/last-modified.c b/builtin/last-modified.c
index e4c73464c7..19bf25f8a5 100644
--- a/builtin/last-modified.c
+++ b/builtin/last-modified.c
@@ -179,6 +192,27 @@ static void last_modified_diff(struct diff_queue_struct *q,
 	}
 }
 
+static int maybe_changed_path(struct last_modified *lm, struct commit *origin)
+{
+	struct bloom_filter *filter;
+	struct last_modified_entry *ent;
+	struct hashmap_iter iter;
+
+	if (!lm->rev.bloom_filter_settings)
+		return 1;
+
+	filter = get_bloom_filter(lm->rev.repo, origin);
+	if (!filter)
+		return 1;
+
+	hashmap_for_each_entry(&lm->paths, &iter, ent, hashent) {
+		if (bloom_filter_contains(filter, &ent->key,
+					  lm->rev.bloom_filter_settings))
+			return 1;
+	}
+	return 0;
+}
This function is basically the same as `maybe_changed_paths()` in
"blame.c", but that isn't a huge issue from my point of view. What makes
me wonder though is why we have an additional check over there for
whether or not the commit has a valid generation number.
I've been asking me the same question. And I couldn't find a good reason
(neither from the commit history, or from my reasoning). This check was
in the version shared by Taylor, but because we were ignoring the return
value from generation_numbers_enabled() in that version, it didn't make
sense to me to do this check. That's why I removed it.
Okay. It might make sense to point this out in the commit message.
quoted
quoted
@@ -227,6 +264,9 @@ static int last_modified_init(struct last_modified *lm, struct repository *r,
 		return argc;
 	}
 
+	prepare_commit_graph(lm->rev.repo);
+	lm->rev.bloom_filter_settings = get_bloom_filter_settings(lm->rev.repo);
+
So this here is why we export `prepare_commit_graph()`. How about we
instead expose `bloom_filters_enabled()` that mirrors what we do in
`generation_numbers_enabled()` and `corrected_commit_dates_enabled()`?
That would both be on a higher level and do exactly what we want to
achieve.
I've got another proposal, what if we let get_bloom_filter_settings()
call prepare_commit_graph()? Functions like
repo_find_commit_pos_in_graph() and lookup_commit_in_graph() do this
too. 
Yeah, I don't see any issue with that, either.

Patrick
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help