Re: [PATCH 4/7] sparse-checkout: error or warn when given individual files
From: Derrick Stolee <hidden>
Date: 2022-02-14 15:56:41
On 2/12/2022 7:39 PM, Elijah Newren via GitGitGadget wrote:
From: Elijah Newren <redacted> The set and add subcommands accept multiple positional arguments. The meaning of these arguments differs slightly in the two modes: Cone mode only accepts directories. If given a file, it would previously treat it as a directory, causing not just the file itself to be included but all sibling files as well -- likely against users' expectations. Throw an error if the specified path is a file in the index. Provide a --skip-checks argument to allow users to override (e.g. for the case when the given path IS a directory on another branch).
I agree that this is likely to be an improvement for users. The sparse-checkout builtin isn't integrated with the sparse index yet. At least not integrated upstream: we have commits in microsoft/git that we plan to send when other things in flight are merged. This change likely introduces a new opportunity for the index to expand, so I will keep that in mind when upstreaming.
Non-cone mode accepts general gitignore patterns. However, it has an O(N*M) performance baked into its design, where all N index files must be matched against all M sparse-checkout patterns with EVERY call to unpack_trees() that updates the working tree. As such, it is important to keep the number of patterns small, and thus we should warn users to prefer passing directories and more generic glob patterns to get the paths they want instead of listing each individual file. (The --skip-checks argument can also be used to bypass this warning.) Also, even when users do want to specify individual files, they will often want to do so by providing a leading '/' (to avoid selecting the same filename in all subdirectories), in which case this error message would never trigger anyway.
I think the case of "I want only one file from this directory" and "I want files with the given name pattern" are the main reason to still use non-cone-mode. Users with this need usually have a directory full of large files, and they choose which of those large files they need using sparse-checkout. The repository reorganization required to use cone mode for this use is perhaps too great (or they haven't thought about doing it). For this reason, I would prefer that we do not do these checks when not in cone mode.
+test_expect_success 'by default, cone mode will error out when passed files' ' + git -C repo sparse-checkout reapply --cone && + test_must_fail git -C repo sparse-checkout add .gitignore 2>error && + + grep ".gitignore.*is not a directory" error +' + +test_expect_success 'by default, non-cone mode will warn on individual files' ' + git -C repo sparse-checkout reapply --no-cone && + git -C repo sparse-checkout add .gitignore 2>warning && + + grep "passing directories or less specific patterns is recommended" warning +'
So I would expect this second test to have test_must_be_empty warning to show that no warning occurs when specifying a file in non-cone-mode. Thanks, -Stolee