Re: [PATCH v3 2/5] config: read protected config with `git_protected_config()`

From: Glen Choo <hidden>
Date: 2022-05-31 17:43:25

Junio C Hamano [off-list ref] writes:

quoted

Protected config is read using `read_very_early_config()`, which has
several downsides:

- Every call to `read_very_early_config()` parses global and
  system-level config files anew, but this can be optimized by just
  parsing them once [1].
- Protected variables should respect "-c" because we can reasonably
  assume that it comes from the user. But, `read_very_early_config()`
  can't use "-c" because it is called so early that it does not have
  access to command line arguments.

Now we are talking about protected "variable".  Is that a synonym
for "config", or are there some distinctions between them?

Sorry, that's an old term I was toying with (this somehow snuck through
my proofreading). I just meant "variable that is only read from
protected config", aka a "protected config only variable".

A goal in this version was to introduce as little jargon as possible, so
- "protected config" refers to the set of config sources, and
- "protected config only" refers to config variables/settings that are
  only read from protected config.

quoted

- Protected config is stored in `the_repository` so that we don't need
  to statically allocate it. But this might be confusing since protected
  config ignores repository config by definition.

Yes, it indeed is.  Is it because we were over-eager when we
introduced the "struct repository *repo" parameter to many functions
and the configuration system wants you to have some repository, even
when you know you are not reading from any repository?

Ah no, I was just trying to avoid yet-another global variable (since
IIRC we want to move towards a more lib-like Git), and the_repository
was a convenient global variable to (ab)use.

I am wondering if it is a cleaner solution *not* to hang the
protected config as a configset in the_repository, but keep the
configset as a separate global variable, perhaps static to config.c
and is meant to be only accessed via git_protected_config() and the
like.

I think your suggestion to use a global variable is better, as much as I
want to avoid another global variable. Protected config would affect any
repositories that we work with in-core, so using a global sounds ok.

environment.c might be a better place since we already make a concerted
effort to put global config variables there instead of config.c.

As an aside, I wonder how we could get rid of all of the globals in
environment.c in the long term. Maybe we would have yet-another all
encompassing global, the_environment, and then figure out which
variables belong to the repository and which belong to the environment.

quoted

@@ -295,6 +295,11 @@ void repo_clear(struct repository *repo)
 		FREE_AND_NULL(repo->remote_state);
 	}
 
+	if (repo->protected_config) {
+		git_configset_clear(repo->protected_config);
+		FREE_AND_NULL(repo->protected_config);
+	}
+

This becomes necessary only because each repository instance has
protected_config, even though we need only one instance, no matter
how many repositories we are accessing in this single invocation of
Git, no?

Yes.

How should "git config -l" interact with "protected config" and
"protected variables", by the way?  Should a user be able to tell
which ones are coming from protected scope?  Should we gain, next to
--global, --system, etc., --protected option to list only the
protected config/variable?

I'll have to think about this some more. My initial thoughts are that we
should do this if we formalize 'protected' as a scope-like concept, but
I don't see the lack of "--protected" as a significant hindrance to
users because they can use "--global" and "--system" (albeit in two
invocations instead of one).

This is another thing that I find iffy on terminology.  Should a
random variable, like user.name, be a "protected config", if it is
found in $HOME/.gitconfig?  If it comes from there, surely we can
trust its value, but unlike things like safe.directory, there is no
code that wants to enforce that we pay attention only to user.name
that came from trusted scopes.  Should such a variable be called
"protected variable"?

Ah.. I think it would be best to pretend that the "Protected variable"
typo never happened. That term was destined to be confusing and
meaningless.

Instead, we can use "protected config" to refer to the config and
"protected config only" to refer to variables. Since "protected config"
is defined as (global + system + CLI) config, then yes, we would say
that it is "protected config". But since we do not enforce that
"user.name" _must_ come from only protected config, it is not "protected
config only".

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help