Re: [PATCH v2 2/2] string-list: add string_list_sort_u() that mimics "sort -u"
From: Amisha Chhajed <hidden>
Date: 2026-01-27 01:27:28
Possibly related (same subject, not in this thread)
- 2026-01-26 · [PATCH v2 2/2] string-list: add string_list_sort_u() that mimics "sort -u" · Amisha Chhajed <hidden>
The one in builtin/fetch.c::cmd_fetch() smells somewhat fishy. It prepares a string_list "list", populates it with for_each_remote() by appending remotes found in the configuration when asked to do "--all", or append named ones with "--multiple", and then calls "remove duplicates" without sorting the resulting list first. - A test should be able to demonstrate that the call to string_list_remove_duplicates() is not operating on a sorted string list. - Once a breakage is demonstrated, we need to devise a fix. Sorting the string list before removing would certainly fix the duplicates removal, but it will change the order in which the remotes are consulted. I think it is currently "whatever order these remotes appear in your configuration file(s)", but that does not mean it is a random order. It is very likely that they are in the order the user has learned to expect the remotes are to be consulted, so "sort and then dedup" might appear as a regression in behaviour. I dunno.
was able to make the behavior fail on test, from file t5506-remote-groups.sh, test_expect_success 'group with non-adjacent duplicate remotes causes duplicate fetches (expected-to-fail)' ' mark fetch-dup && update_repos && git config --add remotes.dup one && git config --add remotes.dup two && git config --add remotes.dup one && rm -f .git/FETCH_HEAD && git -c fetch.parallel=3 remote update dup && repo_fetched two ' gave log with command, make -C t T=t5506-remote-groups.sh GIT_TEST_OPTS="-v" from git/, Fetching one Fetching two Fetching one From one d1132c8..cceba7b main -> one/main From two 72c2514..57c13f4 main -> two/main From one d1132c8..cceba7b main -> one/main error: fetching ref refs/remotes/one/main failed: incorrect old value provided could not fetch 'one' (exit code: 1) Did not see any test failures post replacing this call with string_list_sort_u() but from the code, order of the list matters, and i think the expected behaviour is if any duplicate the first call of the item should only remain but i am unsure if it is achievable with the current sort and remove duplicate methods this is more of a, if current_item in seen_set: pass else: add(list, current_item) add(seen_set, current_item) this preserves the order and maintains the time complexity and only keeps the first occurence.
The one in builtin/help.c::list_config_help() is somewhat fishy as well. I didn't read it too carefully, but it walks over keys which is in sorted string_list, and sometimes pushes the key intact to keys_uniq, and some other times munges the key and pushes the result to keys_uniq. I do not know if presence of these these munged keys in the keys_uniq string list breaks the sortedness of keys_uniq. If keys_uniq is *not* sorted, then running "remove duplicates" would be broken, of course. Again, a test should be able to demonstrate if this is the case, and we should fix it as well if it is broken.
This one is a bit more complex, there is a very specific case that would result in un-sorted behaviour of the string-list that is of form, sorted: [aa*., aa.b] post processing in &keys_uniq: [aa*, aa] (this is unsorted) because ASCII value of * < . and the precedence in our code is . < * [1] but this still works as remove_duplicates does not depend strictly on the sorted property as we remove adjacent similar elements which would also work if similar items are somehow grouped together, which they are under the conditions of our code. but sorted(x) != sorted(f(x)), and the processing we are doing also does not depend on the order of the elements so maybe shifting the sort below with remove_duplicates might work as a solution and would make the behaviour deterministic, saw no failures post shifting the sort down with it. [1] Snippet from fetch.c if (dot) cut = dot; else if (wildcard && !tag) cut = wildcard; else if (!wildcard && tag) cut = tag; else cut = wildcard < tag ? wildcard : tag;