Thread (57 messages), 5 authors, 2022-02-20

Re: [PATCH 4/7] sparse-checkout: error or warn when given individual files

From: Elijah Newren <hidden>
Date: 2022-02-15 04:17:56

On Mon, Feb 14, 2022 at 7:56 AM Derrick Stolee [off-list ref] wrote:
On 2/12/2022 7:39 PM, Elijah Newren via GitGitGadget wrote:
quoted
From: Elijah Newren <redacted>

The set and add subcommands accept multiple positional arguments.
The meaning of these arguments differs slightly in the two modes:

Cone mode only accepts directories.  If given a file, it would
previously treat it as a directory, causing not just the file itself to
be included but all sibling files as well -- likely against users'
expectations.  Throw an error if the specified path is a file in the
index.  Provide a --skip-checks argument to allow users to override
(e.g. for the case when the given path IS a directory on another
branch).
I agree that this is likely to be an improvement for users. The
sparse-checkout builtin isn't integrated with the sparse index
yet. At least not integrated upstream: we have commits in microsoft/git
that we plan to send when other things in flight are merged. This
change likely introduces a new opportunity for the index to expand,
so I will keep that in mind when upstreaming.
Actually, I thought about that during development, and my presumption
was that we would not expand the index.  We've survived a few years
without reporting any argument errors to the user and folks seem to
usually get things right, so while I think it adds value to report on
likely errors, I don't think it's important for us to catch and warn
on every potential misuse.  I think the probable errors are the ones
where they specify a <file> that exists in both the working tree and
index.  The remaining ones are less probable, and also possibly quite
expensive to catch.  I'm not sure it's worth the cost to try to report
those.
quoted
Non-cone mode accepts general gitignore patterns.  However, it has an
O(N*M) performance baked into its design, where all N index files must
be matched against all M sparse-checkout patterns with EVERY call to
unpack_trees() that updates the working tree.  As such, it is important
to keep the number of patterns small, and thus we should warn users to
prefer passing directories and more generic glob patterns to get the
paths they want instead of listing each individual file.  (The
--skip-checks argument can also be used to bypass this warning.)  Also,
even when users do want to specify individual files, they will often
want to do so by providing a leading '/' (to avoid selecting the same
filename in all subdirectories), in which case this error message would
never trigger anyway.
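To make the O(N*M) point concrete, here is a toy model in plain shell. It is not how Git matches patterns (the path list, the `count_checks` helper, and the use of shell `case` globbing are all assumptions made for illustration); it only counts how many path-versus-pattern comparisons happen in the worst case when every path is tested against every pattern.

```shell
#!/bin/sh
set -f  # keep the shell from expanding the glob patterns against the filesystem

# N = 5 hypothetical index paths.
paths="README src/a.c src/b.c docs/guide.txt docs/api.txt"

# count_checks PATTERN...: compare every path with every pattern (worst case)
# and report how many comparisons were needed and how many paths matched.
count_checks() {
	checks=0
	matched=0
	for path in $paths; do
		hit=0
		for pattern in "$@"; do
			checks=$((checks + 1))
			case "$path" in
				$pattern) hit=1 ;;
			esac
		done
		matched=$((matched + hit))
	done
	echo "patterns=$# checks=$checks matched=$matched"
}

# One pattern per file versus a few broader globs selecting the same paths.
per_file=$(count_checks README src/a.c src/b.c docs/guide.txt docs/api.txt)
globbed=$(count_checks README 'src/*' 'docs/*')
echo "$per_file"
echo "$globbed"
```

Both calls select the same five paths, but the per-file list needs 25 comparisons where the glob list needs 15, and the gap widens as either the index or the pattern list grows.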
I think the case of "I want only one file from this directory" and "I
want files with the given name pattern" are the main reason to still
use non-cone-mode. Users with this need usually have a directory full
of large files, and they choose which of those large files they need
using sparse-checkout. The repository reorganization required to use
cone mode for this use is perhaps too great (or they haven't thought
about doing it). For this reason, I would prefer that we do not do
these checks when not in cone mode.
If they "only want one file from this directory", isn't the correct
way to specify that by mentioning the path with a leading slash?
Otherwise, they'd potentially grab files with similar names from many
directories, right?  So, even in that use case, we should still error
out if they specify a <filename> rather than /<filename>.  Perhaps my
reasoning should lead with that and I should fix up the warning
message a bit, but I still think we should probably give a warning
even for those who explicitly want the use case you mention.

Also, note this is a warning and not an error -- and the warning can
be suppressed with --skip-checks.
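A minimal sketch of the <filename> versus /<filename> difference in non-cone mode, assuming a throwaway repository with a README at the top level and another under sub/ (all names are invented for illustration) and a Git new enough to accept `set --no-cone`. Any warning printed by the first `set` call is discarded, since it does not change the result.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository; every name here is illustrative.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
mkdir sub
echo top >README
echo nested >sub/README
echo other >sub/other.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m init

# Unanchored pattern: gitignore rules match README in any directory.
git sparse-checkout set --no-cone README 2>/dev/null
unanchored=$(find . -name .git -prune -o -type f -print | LC_ALL=C sort)

# Anchored pattern: the leading slash restricts the match to the top level.
git sparse-checkout set --no-cone /README 2>/dev/null
anchored=$(find . -name .git -prune -o -type f -print | LC_ALL=C sort)

printf 'unanchored:\n%s\nanchored:\n%s\n' "$unanchored" "$anchored"
```

The unanchored run should keep both README and sub/README, while the anchored run should keep only the top-level README.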
quoted
+test_expect_success 'by default, cone mode will error out when passed files' '
+     git -C repo sparse-checkout reapply --cone &&
+     test_must_fail git -C repo sparse-checkout add .gitignore 2>error &&
+
+     grep ".gitignore.*is not a directory" error
+'
+
+test_expect_success 'by default, non-cone mode will warn on individual files' '
+     git -C repo sparse-checkout reapply --no-cone &&
+     git -C repo sparse-checkout add .gitignore 2>warning &&
+
+     grep "passing directories or less specific patterns is recommended" warning
+'
So I would expect this second test to have

        test_must_be_empty warning

to show that no warning occurs when specifying a file in non-cone-mode.
or perhaps

        grep "please specify a leading slash to select a single file" warning

?