Thread (35 messages) 35 messages, 5 authors, 2023-11-16

Re: [PATCH 7/9] ref-filter.c: filter & format refs in the same callback

From: Patrick Steinhardt <hidden>
Date: 2023-11-07 10:50:01

On Tue, Nov 07, 2023 at 01:25:59AM +0000, Victoria Dye via GitGitGadget wrote:
quoted hunk ↗ jump to hunk
From: Victoria Dye <redacted>

Update 'filter_and_format_refs()' to try to perform ref filtering &
formatting in a single ref iteration, without an intermediate 'struct
ref_array'. This can only be done if no operations need to be performed on a
pre-filtered array; specifically, if the refs are

- filtered on reachability,
- sorted, or
- formatted with ahead-behind information

they cannot be filtered & formatted in the same iteration. In that case,
fall back on the current filter-then-sort-then-format flow.

This optimization substantially improves memory usage due to no longer
storing a ref array in memory. In some cases, it also dramatically reduces
runtime (e.g. 'git for-each-ref --no-sort --count=1', which no longer loads
all refs into a 'struct ref_array' to printing only the first ref).

Signed-off-by: Victoria Dye <redacted>
---
 ref-filter.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 74 insertions(+), 6 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index ff00ab4b8d8..384cf1595ff 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2863,6 +2863,44 @@ static void free_array_item(struct ref_array_item *item)
 	free(item);
 }
 
+struct ref_filter_and_format_cbdata {
+	struct ref_filter *filter;
+	struct ref_format *format;
+
+	struct ref_filter_and_format_internal {
+		int count;
+	} internal;
+};
+
+static int filter_and_format_one(const char *refname, const struct object_id *oid, int flag, void *cb_data)
+{
+	struct ref_filter_and_format_cbdata *ref_cbdata = cb_data;
+	struct ref_array_item *ref;
+	struct strbuf output = STRBUF_INIT, err = STRBUF_INIT;
+
+	ref = apply_ref_filter(refname, oid, flag, ref_cbdata->filter);
+	if (!ref)
+		return 0;
+
+	if (format_ref_array_item(ref, ref_cbdata->format, &output, &err))
+		die("%s", err.buf);
+
+	if (output.len || !ref_cbdata->format->array_opts.omit_empty) {
+		fwrite(output.buf, 1, output.len, stdout);
+		putchar('\n');
+	}
+
+	strbuf_release(&output);
+	strbuf_release(&err);
+	free_array_item(ref);
+
+	if (ref_cbdata->format->array_opts.max_count &&
+	    ++ref_cbdata->internal.count >= ref_cbdata->format->array_opts.max_count)
+		return -1;
It feels a bit weird to return a negative value here, which usually
indicates that an error has happened whereas we only use it here to
abort the iteration. But we ignore the return value of
`do_iterate_refs()` anyway, so it doesn't make much of a difference.
quoted hunk ↗ jump to hunk
+	return 0;
+}
+
 /* Free all memory allocated for ref_array */
 void ref_array_clear(struct ref_array *array)
 {
@@ -3046,16 +3084,46 @@ int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int
 	return ret;
 }
 
+static inline int can_do_iterative_format(struct ref_filter *filter,
+					  struct ref_sorting *sorting,
+					  struct ref_format *format)
+{
+	/*
+	 * Refs can be filtered and formatted in the same iteration as long
+	 * as we aren't filtering on reachability, sorting the results, or
+	 * including ahead-behind information in the formatted output.
+	 */
Do we want to format this as a bulleted list so that it's more readily
extensible if we ever need to pay attention to new options here? Also, I
noted that this commit doesn't add any new tests -- do we already
exercise all of these conditions?

More generally, I worry a bit about maintainability of this code snippet
as we need to remember to always update this condition whenever we add a
new option, and this can be quite easy to miss. The performance benefit
might be worth the effort though.

Patrick
+	return !(filter->reachable_from ||
+		 filter->unreachable_from ||
+		 sorting ||
+		 format->bases.nr);
+}
+
 void filter_and_format_refs(struct ref_filter *filter, unsigned int type,
 			    struct ref_sorting *sorting,
 			    struct ref_format *format)
 {
-	struct ref_array array = { 0 };
-	filter_refs(&array, filter, type);
-	filter_ahead_behind(the_repository, format, &array);
-	ref_array_sort(sorting, &array);
-	print_formatted_ref_array(&array, format);
-	ref_array_clear(&array);
+	if (can_do_iterative_format(filter, sorting, format)) {
+		int save_commit_buffer_orig;
+		struct ref_filter_and_format_cbdata ref_cbdata = {
+			.filter = filter,
+			.format = format,
+		};
+
+		save_commit_buffer_orig = save_commit_buffer;
+		save_commit_buffer = 0;
+
+		do_filter_refs(filter, type, filter_and_format_one, &ref_cbdata);
+
+		save_commit_buffer = save_commit_buffer_orig;
+	} else {
+		struct ref_array array = { 0 };
+		filter_refs(&array, filter, type);
+		filter_ahead_behind(the_repository, format, &array);
+		ref_array_sort(sorting, &array);
+		print_formatted_ref_array(&array, format);
+		ref_array_clear(&array);
+	}
 }
 
 static int compare_detached_head(struct ref_array_item *a, struct ref_array_item *b)
-- 
gitgitgadget

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help