Re: [PATCH 07/10] submodule: make it a built-in, remove git-submodule.sh
From: Glen Choo <hidden>
Date: 2022-10-20 22:49:48
Ævar Arnfjörð Bjarmason [off-list ref] writes:
Replace the "git-submodule.sh" script with a built-in "builtin/submodule.c. For" now this new command is only a dumb dispatcher that uses run-command.c to invoke "git submodule--helper", just as "git-submodule.sh" used to do. This is obviously not ideal, and we should eventually follow-up and merge the "builtin/submodule--helper.c" code into "builtin/submodule.c". Doing it this way makes it easy to review that this new C implementation isn't doing anything more clever than the old shellscript implementation. This is a large win for performance, we're now more than 4x as fast as before in terms of the fixed cost of invoking any "git submodule" command[1]: $ git hyperfine -L rev HEAD~1,HEAD -s 'make CFLAGS=-O3' './git --exec-path=$PWD submodule foreach "echo \$name"' Benchmark 1: ./git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD~1 Time (mean ± σ): 42.2 ms ± 0.4 ms [User: 34.9 ms, System: 9.1 ms] Range (min … max): 41.3 ms … 43.2 ms 70 runs Benchmark 2: ./git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD Time (mean ± σ): 9.7 ms ± 0.1 ms [User: 7.6 ms, System: 2.2 ms] Range (min … max): 9.5 ms … 10.3 ms 282 runs Summary './git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD' ran 4.33 ± 0.07 times faster than './git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD~1' We're taking pains here to faithfully reproduce existing "git-submodule.sh" behavior, even when that behavior is stupid. Some of it we'll fix in subsequent commits, but let's first faithfully reproduce the behavior. One exception is the change in the behavior of the exit code stand-alone "-h" and "--" yield, see the altered tests. Returning 129 instead of 0 and 1 for "-h" and "--" respectively is a concession to basic sanity.
Sounds reasonable.
quoted hunk ↗ jump to hunk
The pattern of using "define BUILTIN_" macros here isn't needed for now, but as we'll move code out of "builtin/submodule--helper.c" we'll want to re-use these strings. See 8757b35d443 (commit-graph: define common usage with a macro, 2021-08-23) and 1e91d3faf6c (reflog: move "usage" variables and use macros, 2022-03-17) for prior art using this pattern. The "(argc < 2 || !strcmp(argv[1], "-h"))" path at the top of cmd_submodule__helper() could now be a "(argc < 2)" if not for t0012-help.sh (which invokes all built-ins manually with "-h"). Let's leave it for now, eventually we'll consolidate the two. 1. Using the "git hyperfine" wrapper for "hyperfine": https://lore.kernel.org/git/211201.86r1aw9gbd.gmgdl@evledraar.gmail.com/ (local) Signed-off-by: Ævar Arnfjörð Bjarmason <redacted> --- Makefile | 2 +- builtin.h | 1 + builtin/submodule.c | 151 +++++++++++++++++++++++++++++++++++++ git-submodule.sh | 91 ---------------------- git.c | 1 + t/t7400-submodule-basic.sh | 12 ++- 6 files changed, 159 insertions(+), 99 deletions(-) create mode 100644 builtin/submodule.c delete mode 100755 git-submodule.shdiff --git a/Makefile b/Makefile index 6bfb62cbe94..d8e2c02ad42 100644 --- a/Makefile +++ b/Makefile@@ -635,7 +635,6 @@ SCRIPT_SH += git-merge-resolve.sh SCRIPT_SH += git-mergetool.sh SCRIPT_SH += git-quiltimport.sh SCRIPT_SH += git-request-pull.sh -SCRIPT_SH += git-submodule.sh SCRIPT_SH += git-web--browse.sh SCRIPT_LIB += git-mergetool--lib@@ -1235,6 +1234,7 @@ BUILTIN_OBJS += builtin/show-ref.o BUILTIN_OBJS += builtin/sparse-checkout.o BUILTIN_OBJS += builtin/stash.o BUILTIN_OBJS += builtin/stripspace.o +BUILTIN_OBJS += builtin/submodule.o BUILTIN_OBJS += builtin/submodule--helper.o BUILTIN_OBJS += builtin/symbolic-ref.o BUILTIN_OBJS += builtin/tag.odiff --git a/builtin.h b/builtin.h index 8901a34d6bf..475c79b6a5a 100644 --- a/builtin.h +++ b/builtin.h@@ -224,6 +224,7 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix); int cmd_status(int argc, const char **argv, const char *prefix); int cmd_stash(int argc, const char **argv, const char *prefix); int cmd_stripspace(int argc, const char **argv, const char *prefix); +int cmd_submodule(int argc, const char **argv, const char *prefix); int cmd_submodule__helper(int argc, const char **argv, const char *prefix); int cmd_switch(int argc, const char **argv, const char *prefix); int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);diff --git a/builtin/submodule.c b/builtin/submodule.c new file mode 100644 index 00000000000..7e3499f3376 --- /dev/null +++ b/builtin/submodule.c@@ -0,0 +1,151 @@ +/* + * Copyright (c) 2007-2022 Lars Hjemli & others + * Copyright(c) 2022 Ævar Arnfjörð Bjarmason + */ +#include "builtin.h" +#include "parse-options.h" +#include "run-command.h" +#include "strvec.h" + +#define BUILTIN_SUBMODULE_USAGE \ + "git submodule [--quiet] [--cached]" + +#define BUILTIN_SUBMODULE_ADD_USAGE \ + N_("git submodule [--quiet] add [-b <branch>] [-f | --force] [--name <name>]\n" \ + " [--reference <repository>] [--] <repository> [<path>]") + +#define BUILTIN_SUBMODULE_STATUS_USAGE \ + N_("git submodule [--quiet] status [--cached] [--recursive] [--] [<path>...]") + +#define BUILTIN_SUBMODULE_INIT_USAGE \ + N_("git submodule [--quiet] init [--] [<path>...]") + +#define BUILTIN_SUBMODULE_DEINIT_USAGE \ + N_("git submodule [--quiet] deinit [-f | --force] (--all | [--] <path>...)") + +#define BUILTIN_SUBMODULE_UPDATE_USAGE \ + N_("git submodule [--quiet] update [-v] [--init [--filter=<filter-spec>]]\n" \ + " [--remote] [-N | --no-fetch] [-f | --force] [--checkout |--merge | --rebase]\n" \ + " [--[no-]recommend-shallow] [--reference <repository>] [--recursive]\n" \ + " [--[no-]single-branch] [--] [<path>...]") + +#define BUILTIN_SUBMODULE_SET_BRANCH_USAGE \ + N_("git submodule [--quiet] set-branch (--default | --branch <branch>) [--] <path>") + +#define BUILTIN_SUBMODULE_SET_URL_USAGE \ + N_("git submodule [--quiet] set-url [--] <path> <newurl>") + +#define BUILTIN_SUBMODULE_SUMMARY_USAGE \ + N_("git submodule [--quiet] summary [--cached | --files] [--summary-limit <n>]\n" \ + " [commit] [--] [<path>...]") +#define BUILTIN_SUBMODULE_FOREACH_USAGE \ + N_("git submodule [--quiet] foreach [--recursive] <command>") + +#define BUILTIN_SUBMODULE_SYNC_USAGE \ + N_("git submodule [--quiet] sync [--recursive] [--] [<path>...]") + +#define BUILTIN_SUBMODULE_ABSORBGITDIRS_USAGE \ + N_("git submodule [--quiet] absorbgitdirs [--] [<path>...]") +
I was surprised to see that these strings aren't deduped from the ones in builtin/submodule--helper.c, and are, in fact, different. Is there a reason for that?
+static const char * const git_submodule_usage[] = {
+ BUILTIN_SUBMODULE_USAGE,
+ BUILTIN_SUBMODULE_ADD_USAGE,
+ BUILTIN_SUBMODULE_STATUS_USAGE,
+ BUILTIN_SUBMODULE_INIT_USAGE,
+ BUILTIN_SUBMODULE_DEINIT_USAGE,
+ BUILTIN_SUBMODULE_UPDATE_USAGE,
+ BUILTIN_SUBMODULE_SET_BRANCH_USAGE,
+ BUILTIN_SUBMODULE_SET_URL_USAGE,
+ BUILTIN_SUBMODULE_SUMMARY_USAGE,
+ BUILTIN_SUBMODULE_FOREACH_USAGE,
+ BUILTIN_SUBMODULE_SYNC_USAGE,
+ BUILTIN_SUBMODULE_ABSORBGITDIRS_USAGE,
+ NULL,
+};
+
+static void setup_helper_args(int argc, const char **argv, const char *prefix,
+ int quiet, int cached, struct strvec *args,
+ const struct option *options)
+{
+ const char *cmd;
+ int do_quiet_cache = 1;
+ int do_prefix = 1;
+
+ strvec_push(args, "submodule--helper");
+
+ /* No command word defaults to "status" */
+ if (!argc) {
+ strvec_push(args, "status");
+ return;
+ }
We can't return without processing "--cached", e.g. removing the
explicit "status" subcommand like so fails.
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index d8f7d6ee29..5e61cef18e 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -587,7 +587,7 @@ test_expect_success 'status should be "modified" after submodule commit' '
'
test_expect_success 'the --cached sha1 should be rev1' '
- git submodule status --cached >list &&
+ git submodule --cached >list &&
grep "^+$rev1" list
'
+
+ /* Did we get --cached with a command? */
+ if (cached)
+ usage_msg_optf(_("'%s' option is only supported with explicit 'status'"),
+ git_submodule_usage, options, "--cached");
+
+
+ /* Either a valid command, or submodule--helper will barf! */
+ cmd = argv[0];submodule--helper does die, but the help message that it emits is submodule--helper specific, instead of the "git submodule" usage string from before.
+ strvec_push(args, cmd);
+ argv++;
+ argc--;
+
+ /*
+ * This is stupid, but don't support "[--]" to
+ * subcommand-less "git-submodule" for now.
+ */
+ if (!strcmp(cmd, "--") || !strcmp(cmd, "--end-of-options"))
+ usage_msg_optf(_("need explicit sub-command name to delimit with '%s'"),
+ git_submodule_usage, options, cmd);If this is the only "stupid" behavior, maybe just call this out specifically in the commit message.
+
+ /* Options that need to go before user-supplied options */
+ if (!strcmp(cmd, "absorbgitdirs"))
+ do_quiet_cache = 0;
+ else if (!strcmp(cmd, "update"))
+ ;
+ else
+ do_prefix = 0;
+ if (do_quiet_cache) {
+ if (quiet)
+ strvec_push(args, "--quiet");
+ if (cached)
+ strvec_push(args, "--cached");
+
+ if (prefix && do_prefix)
+ strvec_pushl(args, "--prefix", prefix, NULL);
+ }
This looks like a good reason to get rid of "--prefix" from "git
submodule--helper update" like I mentioned upthread.
I didn't notice earlier that absorbgitdirs had the same problem, but
that's an even easier fixup:
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index d11e100301..1f7f142270 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2825,9 +2825,6 @@ static int absorb_git_dirs(int argc, const char **argv, const char *prefix)
struct module_list list = MODULE_LIST_INIT;
unsigned flags = ABSORB_GITDIR_RECURSE_SUBMODULES;
struct option embed_gitdir_options[] = {
- OPT_STRING(0, "prefix", &prefix,
- N_("path"),
- N_("path into the working tree")),
OPT_BIT(0, "--recursive", &flags, N_("recurse into submodules"),
ABSORB_GITDIR_RECURSE_SUBMODULES),
OPT_END()