Re: [PATCH 7/9] ref-filter.c: filter & format refs in the same callback
From: Patrick Steinhardt <hidden>
Date: 2023-11-07 10:50:01
On Tue, Nov 07, 2023 at 01:25:59AM +0000, Victoria Dye via GitGitGadget wrote:
quoted hunk ↗ jump to hunk
From: Victoria Dye <redacted> Update 'filter_and_format_refs()' to try to perform ref filtering & formatting in a single ref iteration, without an intermediate 'struct ref_array'. This can only be done if no operations need to be performed on a pre-filtered array; specifically, if the refs are - filtered on reachability, - sorted, or - formatted with ahead-behind information they cannot be filtered & formatted in the same iteration. In that case, fall back on the current filter-then-sort-then-format flow. This optimization substantially improves memory usage due to no longer storing a ref array in memory. In some cases, it also dramatically reduces runtime (e.g. 'git for-each-ref --no-sort --count=1', which no longer loads all refs into a 'struct ref_array' to printing only the first ref). Signed-off-by: Victoria Dye <redacted> --- ref-filter.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 74 insertions(+), 6 deletions(-)diff --git a/ref-filter.c b/ref-filter.c index ff00ab4b8d8..384cf1595ff 100644 --- a/ref-filter.c +++ b/ref-filter.c@@ -2863,6 +2863,44 @@ static void free_array_item(struct ref_array_item *item) free(item); } +struct ref_filter_and_format_cbdata { + struct ref_filter *filter; + struct ref_format *format; + + struct ref_filter_and_format_internal { + int count; + } internal; +}; + +static int filter_and_format_one(const char *refname, const struct object_id *oid, int flag, void *cb_data) +{ + struct ref_filter_and_format_cbdata *ref_cbdata = cb_data; + struct ref_array_item *ref; + struct strbuf output = STRBUF_INIT, err = STRBUF_INIT; + + ref = apply_ref_filter(refname, oid, flag, ref_cbdata->filter); + if (!ref) + return 0; + + if (format_ref_array_item(ref, ref_cbdata->format, &output, &err)) + die("%s", err.buf); + + if (output.len || !ref_cbdata->format->array_opts.omit_empty) { + fwrite(output.buf, 1, output.len, stdout); + putchar('\n'); + } + + strbuf_release(&output); + strbuf_release(&err); + free_array_item(ref); + + if (ref_cbdata->format->array_opts.max_count && + ++ref_cbdata->internal.count >= ref_cbdata->format->array_opts.max_count) + return -1;
It feels a bit weird to return a negative value here, which usually indicates that an error has happened whereas we only use it here to abort the iteration. But we ignore the return value of `do_iterate_refs()` anyway, so it doesn't make much of a difference.
quoted hunk ↗ jump to hunk
+ return 0; +} + /* Free all memory allocated for ref_array */ void ref_array_clear(struct ref_array *array) {@@ -3046,16 +3084,46 @@ int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int return ret; } +static inline int can_do_iterative_format(struct ref_filter *filter, + struct ref_sorting *sorting, + struct ref_format *format) +{ + /* + * Refs can be filtered and formatted in the same iteration as long + * as we aren't filtering on reachability, sorting the results, or + * including ahead-behind information in the formatted output. + */
Do we want to format this as a bulleted list so that it's more readily extensible if we ever need to pay attention to new options here? Also, I noted that this commit doesn't add any new tests -- do we already exercise all of these conditions? More generally, I worry a bit about maintainability of this code snippet as we need to remember to always update this condition whenever we add a new option, and this can be quite easy to miss. The performance benefit might be worth the effort though. Patrick
+ return !(filter->reachable_from ||
+ filter->unreachable_from ||
+ sorting ||
+ format->bases.nr);
+}
+
void filter_and_format_refs(struct ref_filter *filter, unsigned int type,
struct ref_sorting *sorting,
struct ref_format *format)
{
- struct ref_array array = { 0 };
- filter_refs(&array, filter, type);
- filter_ahead_behind(the_repository, format, &array);
- ref_array_sort(sorting, &array);
- print_formatted_ref_array(&array, format);
- ref_array_clear(&array);
+ if (can_do_iterative_format(filter, sorting, format)) {
+ int save_commit_buffer_orig;
+ struct ref_filter_and_format_cbdata ref_cbdata = {
+ .filter = filter,
+ .format = format,
+ };
+
+ save_commit_buffer_orig = save_commit_buffer;
+ save_commit_buffer = 0;
+
+ do_filter_refs(filter, type, filter_and_format_one, &ref_cbdata);
+
+ save_commit_buffer = save_commit_buffer_orig;
+ } else {
+ struct ref_array array = { 0 };
+ filter_refs(&array, filter, type);
+ filter_ahead_behind(the_repository, format, &array);
+ ref_array_sort(sorting, &array);
+ print_formatted_ref_array(&array, format);
+ ref_array_clear(&array);
+ }
}
static int compare_detached_head(struct ref_array_item *a, struct ref_array_item *b)
--
gitgitgadget
Attachments
- signature.asc [application/pgp-signature] 833 bytes