Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases
From: Jeff King <hidden>
Date: 2018-09-07 03:12:56
On Thu, Sep 06, 2018 at 01:54:15PM -0700, Stefan Beller wrote:
quoted
quoted
It turns out we make never use of a custom compare function in the stringlist, which helps gaining confidence this use case is nowhere to be found in the code.Plenty of code uses the default strcmp. You can find users which assume sorting by their use of string_list_insert() versus _append(). Or ones that call string_list_sort(), of course.Here comes my reading-between-the-lines assumption: When using the default comparison function, you probably only care about the efficient lookup as described above, but if you had a non-default order, then we'd have strong evidence of the contrary as the author of such code would have found reasons why that order is superior than default order (and don't tell me a different order helps making lookups even more efficient, this must be another reason).
That's a reasonable hypothesis. It looks like there are a few cases where we assign to a string_list.cmp, so I picked one arbitrarily to look at: split_maildir() uses a custom filename comparison. Despite having no recollection of this code, it appears to come from my commit 18505c3423. :) And yes, we really do care about order there, as we're trying to read the files in "maildir order". And I don't think a hash would do. That said, I don't think we care about using string-list's sorted operations here (like its binary-search lookup). It would be enough for us to generate the list (whether as string-list or no; we'd actually be happy with any array), sort it, and then iterate over the result. So I think some of these are definitely going to require some thoughtful conversion to the correct data type. Mostly what I was asking in the beginning was: does this seem like an overtly terrible line of thinking to anyone. And so far I think the answer is no. So the next step is to proceed to actually trying some conversions, and seeing what kinds of snags I hit. The devil, as usual, is in the details. :) -Peff