Re: [PATCH v5 3/7] submodule: always validate gitdirs inside submodule_name_to_gitdir
From: Adrian Ratiu <hidden>
Date: 2025-12-05 18:17:40
On Fri, 05 Dec 2025, Patrick Steinhardt [off-list ref] wrote:
On Wed, Nov 19, 2025 at 11:10:26PM +0200, Adrian Ratiu wrote:quoted
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 2873b2780e..9914ca0786 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c@@ -1780,23 +1776,6 @@ static int clone_submodule(const struct module_clone_data *clone_data, free(path); } - /* - * We already performed this check at the beginning of this function, - * before cloning the objects. This tries to detect racy behavior e.g. - * in parallel clones, where another process could easily have made the - * gitdir nested _after_ it was created. - * - * To prevent further harm coming from this unintentionally-nested - * gitdir, let's disable it by deleting the `HEAD` file. - */ - if (validate_submodule_git_dir(sm_gitdir, clone_data->name) < 0) { - char *head = xstrfmt("%s/HEAD", sm_gitdir); - unlink(head); - free(head); - die(_("refusing to create/use '%s' in another submodule's " - "git dir"), sm_gitdir); - } - connect_work_tree_and_git_dir(clone_data_path, sm_gitdir, 0); p = repo_submodule_path(the_repository, clone_data_path, "config");Hm. This one is a bit puzzling to me. This seems to explicitly be a check about a TOCTOU-style race, where a concurrent process might have created the parent repository after our initial validation of the path. We don't call `submodule_name_to_gitdir()` inbetween those two calls though, so why is this not a concern anymore with the unified API?
Excellent catch. It still is a concern and this specific check needs to be added back. My aim with this patch is to: 1. Avoid redundant validation calls for submodule_name_to_gitdir(). 2. Avoid the risk of callers forgetting to validate. 3. Ensure gitdir paths provided by users via configs are always valid, so I added the validation directly into submodule_name_to_gitdir(). Nothing prevents us from running the validation as many times as we need and, in this case, it's perfectly justified to run it twice. I'll add it back in v6.
quoted
diff --git a/submodule.c b/submodule.c index 35c55155f7..8ef028f26b 100644 --- a/submodule.c +++ b/submodule.c@@ -2153,30 +2153,11 @@ int submodule_move_head(const char *path, const char *super_prefix, if (!(flags & SUBMODULE_MOVE_HEAD_DRY_RUN)) { if (old_head) { - if (!submodule_uses_gitfile(path)) - absorb_git_dir_into_superproject(path, - super_prefix); - else { - char *dotgit = xstrfmt("%s/.git", path); - char *git_dir = xstrdup(read_gitfile(dotgit)); - - free(dotgit); - if (validate_submodule_git_dir(git_dir, - sub->name) < 0) - die(_("refusing to create/use '%s' in " - "another submodule's git dir"), - git_dir); - free(git_dir); - } + absorb_git_dir_into_superproject(path, super_prefix); } else { struct strbuf gitdir = STRBUF_INIT; submodule_name_to_gitdir(&gitdir, the_repository, sub->name); - if (validate_submodule_git_dir(gitdir.buf, - sub->name) < 0) - die(_("refusing to create/use '%s' in another " - "submodule's git dir"), - gitdir.buf); connect_work_tree_and_git_dir(path, gitdir.buf, 0); strbuf_release(&gitdir);The second case here makes sense to me, as we do call `submodule_name_to_gitdir()`. But in the first branch of the condition we retrieve the path directly, so we're not guarded by the validation anymore, are we?
The !submodule_uses_gitfile(path) case never validated and relied on the validation done by absorb_git_dir_into_superproject() which calls both validate_submodule_path() and submodule_name_to_gitdir() and has additional checks. So in essence the validation moved to absorb_git_dir_into_superproject(). If you prefer I can try to refactor absorb_git_dir_into_superproject() a bit to make it clearer, otherwise I can just add back the validation here as well... we can validate as many times as we like. :)