Re: [PATCH 0/5] [RFC] config API: return empty list, not NULL
From: Derrick Stolee <hidden>
Date: 2022-09-28 18:38:27
On 9/27/2022 10:40 PM, Junio C Hamano wrote:
"Derrick Stolee via GitGitGadget" [off-list ref] writes:quoted
This work changes the behavior of asking for a multi-valued config key to return an empty list instead of a NULL value. This simplifies the handling of the result and is safer for development in the future. This is based on v4 of my unregister series [1] [1] https://lore.kernel.org/git/pull.1358.v4.git.1664287021.gitgitgadget@gmail.com/ (local) This idea came about due to a bug in the git maintenance unregister work where the result from git_config_get_value_multi() was sent directly to for_each_string_list_item() without checking for a NULL value first. I'm sending this as an RFC mostly because I'm not 100% sure this shift is worth the refactoring pain and effort. I personally think getting an empty list is a safer choice, but I can also understand if someone has a different opinion.Thanks. I actually am in favor of the idea that a NULL can be passed around to signal the lack of a string_list (or the lack of a instance of any "collection" type), and the current code is structured as such, and it gives us extra flexibility. Of course, we need to see if that extra flexibility is worth it. With a colleciton col, "if (col && col->nr)" checks if we have something to work on. But a code like this (which is a longhand for the for_each_string_list_item() issue we just reencountered): col = git_get_some_collection(...); if (!col) return; /* no collection */ if (!col->nr) git_add_to_some_collection(col, the default item); for (i = 0; i < col->nr; i++) do things on col.stuff[i]; can react differently to cases where we have an empty collection and where we do not have any collection to begin with. The other side of the coin is that it would make it harder to treat the lack of collection itself and the collection being empty the same way. The above code might need to become col = git_get_some_collection(...); if (!col) col = git_get_empty_collection(); if (!col->nr) git_add_to_some_collection(col, the default item); for (i = 0; i < col->nr; i++) do things on col.stuff[i]; but if the "get the collection" thing returns an empty collection when there is actually no collection, we can lose two lines from there.
I'm all for conveying more information when possible, but how can the config API provide a distinction between an empty list and a NULL list? The only thing I can think about is a case where the empty value clears the list and no new values are added, such as [bogus "key"] item = one item = two item = With this, the key exists in the config file, but the multi-valued list is empty. Is that an important distinction? I don't think so.
Between these two positions, I am in favor of the flexibility that we can use NULL to signal the lack of collection, not a presence of a collection with zero items, as it feels conceptually cleaner. Counting the hunks in [2/5] and [5/5], it seems that "when no variable with given key is defined, we return an empty set" makes us to have more code in 7 places in [PATCH 2/5], while allowing us to lose code in 10 places in [PATCH 5/5].
Outside of the "if (sl && sl->nr) {" that I forgot to convert into
"if (sl->nr) {" in patch 5, all of those 7 places that have "more code"
end up with only that extra "->nr". The code looks uglier temporarily
in patch 2 to create the compatibility mode so patch 4 can change only
the config API without test failures.
Thanks,
-Stolee