Re: [PATCH 0/5] [RFC] config API: return empty list, not NULL

From: Derrick Stolee <hidden>
Date: 2022-09-28 18:38:27

On 9/27/2022 10:40 PM, Junio C Hamano wrote:

"Derrick Stolee via GitGitGadget" [off-list ref] writes:

quoted

This work changes the behavior of asking for a multi-valued config key to
return an empty list instead of a NULL value. This simplifies the handling
of the result and is safer for development in the future.

This is based on v4 of my unregister series [1]

[1]
https://lore.kernel.org/git/pull.1358.v4.git.1664287021.gitgitgadget@gmail.com/ (local)

This idea came about due to a bug in the git maintenance unregister work
where the result from git_config_get_value_multi() was sent directly to
for_each_string_list_item() without checking for a NULL value first.

I'm sending this as an RFC mostly because I'm not 100% sure this shift is
worth the refactoring pain and effort. I personally think getting an empty
list is a safer choice, but I can also understand if someone has a different
opinion.

Thanks.

I actually am in favor of the idea that a NULL can be passed around
to signal the lack of a string_list (or the lack of a instance of
any "collection" type), and the current code is structured as such,
and it gives us extra flexibility.  Of course, we need to see if
that extra flexibility is worth it.

With a colleciton col, "if (col && col->nr)" checks if we have
something to work on.  But a code like this (which is a longhand for
the for_each_string_list_item() issue we just reencountered):

    col = git_get_some_collection(...);
    if (!col)
	return; /* no collection */
    if (!col->nr)
	git_add_to_some_collection(col, the default item);
    for (i = 0; i < col->nr; i++)
	do things on col.stuff[i];

can react differently to cases where we have an empty collection
and where we do not have any collection to begin with.  

The other side of the coin is that it would make it harder to treat
the lack of collection itself and the collection being empty the
same way.  The above code might need to become

    col = git_get_some_collection(...);
    if (!col)
	col = git_get_empty_collection();
    if (!col->nr)
	git_add_to_some_collection(col, the default item);
    for (i = 0; i < col->nr; i++)
	do things on col.stuff[i];

but if the "get the collection" thing returns an empty collection
when there is actually no collection, we can lose two lines from
there.

I'm all for conveying more information when possible, but how can
the config API provide a distinction between an empty list and a
NULL list? The only thing I can think about is a case where the
empty value clears the list and no new values are added, such as

	[bogus "key"]
		item = one
		item = two
		item =

With this, the key exists in the config file, but the multi-valued
list is empty. Is that an important distinction? I don't think so.

Between these two positions, I am in favor of the flexibility that
we can use NULL to signal the lack of collection, not a presence of
a collection with zero items, as it feels conceptually cleaner.

Counting the hunks in [2/5] and [5/5], it seems that "when no
variable with given key is defined, we return an empty set" makes us
to have more code in 7 places in [PATCH 2/5], while allowing us to
lose code in 10 places in [PATCH 5/5].

Outside of the "if (sl && sl->nr) {" that I forgot to convert into
"if (sl->nr) {" in patch 5, all of those 7 places that have "more code"
end up with only that extra "->nr". The code looks uglier temporarily
in patch 2 to create the compatibility mode so patch 4 can change only
the config API without test failures.

Thanks,
-Stolee

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help