Thread: 53 messages, 3 authors, 2026-02-24

Re: [PATCH 07/17] refs: speed up `refs_for_each_glob_ref_in()`

From: Karthik Nayak <hidden>
Date: 2026-02-23 08:27:17

Patrick Steinhardt [off-list ref] writes:
The function `refs_for_each_glob_ref_in()` can be used to iterate
through all refs in a specific prefix with globbing. The logic to handle
this is currently hosted by `refs_for_each_glob_ref_in()`, which sets up
a callback function that knows to filter out refs that _don't_ match the
given globbing pattern.

The way we do this is somewhat inefficient though: even though the
function is expected to only yield refs in the given prefix, we still
end up iterating through _all_ references, regardless of whether or not
their name matches the given prefix.

So currently, instead of relying on the backends to do the prefix
matching, the function does the prefix matching in its own callback.

Extend `refs_for_each_ref_ext()` so that it can handle patterns and
adapt `refs_for_each_glob_ref_in()` to use it. This means we continue to
use the same callback-based infrastructure to filter individual refs via
the globbing pattern, but we can now also use the other functionality of
the `_ext()` variant.

So this change ensures we no longer do the prefix filtering ourselves
and instead let the backend do it.

Most importantly, this means that we now properly handle the prefix.
This results in a performance improvement when using a prefix where a
significant majority of refs exists outside of the prefix. The following
benchmark is an extreme case, with 1 million refs that exist outside the
prefix and a single ref that exists inside it:

    Benchmark 1: git rev-parse --branches=refs/heads/* (rev = HEAD~)
      Time (mean ± σ):     115.9 ms ±   0.7 ms    [User: 113.0 ms, System: 2.4 ms]
      Range (min … max):   114.9 ms … 117.8 ms    25 runs

    Benchmark 2: git rev-parse --branches=refs/heads/* (rev = HEAD)
      Time (mean ± σ):       1.1 ms ±   0.1 ms    [User: 0.3 ms, System: 0.7 ms]
      Range (min … max):     1.0 ms …   2.3 ms    2092 runs

    Summary
      git rev-parse --branches=refs/heads/* (rev = HEAD) ran
      107.01 ± 6.49 times faster than git rev-parse --branches=refs/heads/* (rev = HEAD~)
Nice. That's a really neat bump in speed.
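
To illustrate where the speedup comes from, here is a minimal standalone sketch. It is not Git's code: `struct sorted_refs`, `count_by_full_scan()` and `count_by_prefix_seek()` are hypothetical names, and the sorted array only stands in for a backend that can seek to a prefix. The first function mirrors the old shape (visit every ref, filter in the callback), the second the new shape (seek to the prefix, stop at the first ref outside it).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a ref backend: names sorted by strcmp(). */
struct sorted_refs {
	const char **names;
	size_t nr;
};

static int has_prefix(const char *name, const char *prefix)
{
	return !strncmp(name, prefix, strlen(prefix));
}

/* Old shape: visit every ref and drop non-matching ones on the caller side. */
static size_t count_by_full_scan(const struct sorted_refs *refs,
				 const char *prefix, size_t *visited)
{
	size_t matches = 0;

	*visited = 0;
	for (size_t i = 0; i < refs->nr; i++) {
		(*visited)++;
		if (has_prefix(refs->names[i], prefix))
			matches++;
	}
	return matches;
}

/*
 * New shape: seek to the first ref that sorts at or after the prefix, then
 * stop at the first ref that no longer carries it. `visited` counts only
 * the refs touched by the linear walk, not the binary-search probes.
 */
static size_t count_by_prefix_seek(const struct sorted_refs *refs,
				   const char *prefix, size_t *visited)
{
	size_t lo = 0, hi, matches = 0;

	assert(refs && prefix && visited);
	hi = refs->nr;

	/* Lower bound: first index whose name is >= prefix. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(refs->names[mid], prefix) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}

	*visited = 0;
	for (size_t i = lo; i < refs->nr; i++) {
		(*visited)++;
		if (!has_prefix(refs->names[i], prefix))
			break;
		matches++;
	}
	return matches;
}
```

Both functions return the same match count. Only the number of refs walked differs, which is why the gap grows with the number of refs outside the prefix.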
Signed-off-by: Patrick Steinhardt <redacted>
---
 refs.c | 69 ++++++++++++++++++++++++++++++++++++++----------------------------
 refs.h | 10 ++++++++++
 2 files changed, 50 insertions(+), 29 deletions(-)
diff --git a/refs.c b/refs.c
index ec9e466381..ac34bbe6c1 100644
--- a/refs.c
+++ b/refs.c
@@ -590,40 +590,23 @@ void normalize_glob_ref(struct string_list_item *item, const char *prefix,
 	strbuf_release(&normalized_pattern);
 }

-int refs_for_each_glob_ref_in(struct ref_store *refs, refs_for_each_cb fn,
+int refs_for_each_glob_ref_in(struct ref_store *refs, refs_for_each_cb cb,
 			      const char *pattern, const char *prefix, void *cb_data)
 {
-	struct strbuf real_pattern = STRBUF_INIT;
-	struct for_each_ref_filter filter;
-	int ret;
-
-	if (!prefix && !starts_with(pattern, "refs/"))
-		strbuf_addstr(&real_pattern, "refs/");
-	else if (prefix)
-		strbuf_addstr(&real_pattern, prefix);
-	strbuf_addstr(&real_pattern, pattern);
-
-	if (!has_glob_specials(pattern)) {
-		/* Append implied '/' '*' if not present. */
-		strbuf_complete(&real_pattern, '/');
-		/* No need to check for '*', there is none. */
-		strbuf_addch(&real_pattern, '*');
-	}
-
-	filter.pattern = real_pattern.buf;
-	filter.prefix = prefix;
-	filter.fn = fn;
-	filter.cb_data = cb_data;
-	ret = refs_for_each_ref(refs, for_each_filter_refs, &filter);
-
-	strbuf_release(&real_pattern);
-	return ret;
+	struct refs_for_each_ref_options opts = {
+		.pattern = pattern,
+		.prefix = prefix,
+	};
+	return refs_for_each_ref_ext(refs, cb, cb_data, &opts);
 }

-int refs_for_each_glob_ref(struct ref_store *refs, refs_for_each_cb fn,
+int refs_for_each_glob_ref(struct ref_store *refs, refs_for_each_cb cb,
 			   const char *pattern, void *cb_data)
 {
-	return refs_for_each_glob_ref_in(refs, fn, pattern, NULL, cb_data);
+	struct refs_for_each_ref_options opts = {
+		.pattern = pattern,
+	};
+	return refs_for_each_ref_ext(refs, cb, cb_data, &opts);
 }

 const char *prettify_refname(const char *name)
@@ -1862,16 +1845,44 @@ int refs_for_each_ref_ext(struct ref_store *refs,
 			  refs_for_each_cb cb, void *cb_data,
 			  const struct refs_for_each_ref_options *opts)
 {
+	struct strbuf real_pattern = STRBUF_INIT;
+	struct for_each_ref_filter filter;
 	struct ref_iterator *iter;
+	int ret;

 	if (!refs)
 		return 0;

+	if (opts->pattern) {
+		if (!opts->prefix && !starts_with(opts->pattern, "refs/"))
+			strbuf_addstr(&real_pattern, "refs/");
+		else if (opts->prefix)
+			strbuf_addstr(&real_pattern, opts->prefix);
+		strbuf_addstr(&real_pattern, opts->pattern);
+
+		if (!has_glob_specials(opts->pattern)) {
+			/* Append implied '/' '*' if not present. */
+			strbuf_complete(&real_pattern, '/');
+			/* No need to check for '*', there is none. */
+			strbuf_addch(&real_pattern, '*');
+		}
+
+		filter.pattern = real_pattern.buf;
+		filter.prefix = opts->prefix;

Can't we now remove this option and clean up `for_each_filter_refs()` to
remove the prefix trimming?

+		filter.fn = cb;
+		filter.cb_data = cb_data;
+
+		cb = for_each_filter_refs;
+		cb_data = &filter;
+	}
+
 	iter = refs_ref_iterator_begin(refs, opts->prefix ? opts->prefix : "",
 				       opts->exclude_patterns,
 				       opts->trim_prefix, opts->flags);

-	return do_for_each_ref_iterator(iter, cb, cb_data);
+	ret = do_for_each_ref_iterator(iter, cb, cb_data);
+	strbuf_release(&real_pattern);
+	return ret;
 }

 int refs_for_each_ref(struct ref_store *refs, refs_for_each_cb cb, void *cb_data)
diff --git a/refs.h b/refs.h
index bb9c64a51c..a66dbf3865 100644
--- a/refs.h
+++ b/refs.h
@@ -458,6 +458,16 @@ struct refs_for_each_ref_options {
 	/* Only iterate over references that have this given prefix. */
 	const char *prefix;

+	/*
+	 * A globbing pattern that can be used to only yield refs that match.
+	 * If given, refs will be matched against the pattern with
+	 * `wildmatch()`.
+	 *
+	 * If the pattern doesn't contain any globbing characters then it is
+	 * treated as if it was ending with "/" and "*".
+	 */
+	const char *pattern;
+
 	/*
 	 * Exclude any references that match any of these patterns on a
 	 * best-effort basis. The caller needs to be prepared for the exclude

--
2.53.0.414.gf7e9f6c205.dirty
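
For readers following the pattern handling that the patch moves into `refs_for_each_ref_ext()`, the construction of the effective pattern can be sketched in isolation. This is a simplified illustration, not Git's code: `sketch_real_pattern()` and `sketch_has_glob_specials()` are hypothetical names, a fixed-size buffer replaces `strbuf`, and the set of glob characters (`*`, `?`, `[`) is an assumption about what `has_glob_specials()` checks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for has_glob_specials(): any of '*', '?' or '['. */
static int sketch_has_glob_specials(const char *pattern)
{
	return strpbrk(pattern, "*?[") != NULL;
}

/*
 * Build the effective pattern into `out`:
 *   - with a prefix, the prefix is prepended;
 *   - without one, "refs/" is prepended unless the pattern already has it;
 *   - a pattern without glob characters gets an implied trailing "/" and "*".
 * Returns 0 on success, -1 if the result does not fit into `out`.
 */
static int sketch_real_pattern(const char *pattern, const char *prefix,
			       char *out, size_t outlen)
{
	const char *lead = "";
	size_t len;
	int n, need_slash;

	assert(pattern && out);

	if (prefix)
		lead = prefix;
	else if (strncmp(pattern, "refs/", 5))
		lead = "refs/";

	n = snprintf(out, outlen, "%s%s", lead, pattern);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	len = (size_t)n;

	if (!sketch_has_glob_specials(pattern)) {
		/* Like strbuf_complete(): add '/' only if it is missing. */
		need_slash = len && out[len - 1] != '/';
		if (len + need_slash + 1 >= outlen)
			return -1;
		if (need_slash)
			out[len++] = '/';
		out[len++] = '*';
		out[len] = '\0';
	}
	return 0;
}
```

Under these assumptions, calling it with pattern `heads` and no prefix yields `refs/heads/*`, while pattern `feature` with prefix `refs/heads/` yields `refs/heads/feature/*`.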
