Thread (18 messages) 18 messages, 2 authors, 2022-10-20

Re: [PATCH 07/10] submodule: make it a built-in, remove git-submodule.sh

From: Glen Choo <hidden>
Date: 2022-10-20 22:49:48

Ævar Arnfjörð Bjarmason [off-list ref] writes:
Replace the "git-submodule.sh" script with a built-in
"builtin/submodule.c. For" now this new command is only a dumb
dispatcher that uses run-command.c to invoke "git submodule--helper",
just as "git-submodule.sh" used to do.

This is obviously not ideal, and we should eventually follow-up and
merge the "builtin/submodule--helper.c" code into
"builtin/submodule.c". Doing it this way makes it easy to review that
this new C implementation isn't doing anything more clever than the
old shellscript implementation.

This is a large win for performance, we're now more than 4x as fast as
before in terms of the fixed cost of invoking any "git submodule"
command[1]:

	$ git hyperfine -L rev HEAD~1,HEAD -s 'make CFLAGS=-O3' './git --exec-path=$PWD submodule foreach "echo \$name"'
	Benchmark 1: ./git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD~1
	  Time (mean ± σ):      42.2 ms ±   0.4 ms    [User: 34.9 ms, System: 9.1 ms]
	  Range (min … max):    41.3 ms …  43.2 ms    70 runs

	Benchmark 2: ./git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD
	  Time (mean ± σ):       9.7 ms ±   0.1 ms    [User: 7.6 ms, System: 2.2 ms]
	  Range (min … max):     9.5 ms …  10.3 ms    282 runs

	Summary
	  './git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD' ran
	    4.33 ± 0.07 times faster than './git --exec-path=$PWD submodule foreach "echo \$name"' in 'HEAD~1'

We're taking pains here to faithfully reproduce existing
"git-submodule.sh" behavior, even when that behavior is stupid. Some
of it we'll fix in subsequent commits, but let's first faithfully
reproduce the behavior.

One exception is the change in the behavior of the exit code
stand-alone "-h" and "--" yield, see the altered tests. Returning 129
instead of 0 and 1 for "-h" and "--" respectively is a concession to
basic sanity.
Sounds reasonable.
quoted hunk ↗ jump to hunk
The pattern of using "define BUILTIN_" macros here isn't needed for
now, but as we'll move code out of "builtin/submodule--helper.c" we'll
want to re-use these strings. See 8757b35d443 (commit-graph: define
common usage with a macro, 2021-08-23) and 1e91d3faf6c (reflog: move
"usage" variables and use macros, 2022-03-17) for prior art using this
pattern.

The "(argc < 2 || !strcmp(argv[1], "-h"))" path at the top of
cmd_submodule__helper() could now be a "(argc < 2)" if not for
t0012-help.sh (which invokes all built-ins manually with "-h"). Let's
leave it for now, eventually we'll consolidate the two.

1. Using the "git hyperfine" wrapper for "hyperfine":
   https://lore.kernel.org/git/211201.86r1aw9gbd.gmgdl@evledraar.gmail.com/ (local)

Signed-off-by: Ævar Arnfjörð Bjarmason <redacted>
---
 Makefile                   |   2 +-
 builtin.h                  |   1 +
 builtin/submodule.c        | 151 +++++++++++++++++++++++++++++++++++++
 git-submodule.sh           |  91 ----------------------
 git.c                      |   1 +
 t/t7400-submodule-basic.sh |  12 ++-
 6 files changed, 159 insertions(+), 99 deletions(-)
 create mode 100644 builtin/submodule.c
 delete mode 100755 git-submodule.sh
diff --git a/Makefile b/Makefile
index 6bfb62cbe94..d8e2c02ad42 100644
--- a/Makefile
+++ b/Makefile
@@ -635,7 +635,6 @@ SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
-SCRIPT_SH += git-submodule.sh
 SCRIPT_SH += git-web--browse.sh
 
 SCRIPT_LIB += git-mergetool--lib
@@ -1235,6 +1234,7 @@ BUILTIN_OBJS += builtin/show-ref.o
 BUILTIN_OBJS += builtin/sparse-checkout.o
 BUILTIN_OBJS += builtin/stash.o
 BUILTIN_OBJS += builtin/stripspace.o
+BUILTIN_OBJS += builtin/submodule.o
 BUILTIN_OBJS += builtin/submodule--helper.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
 BUILTIN_OBJS += builtin/tag.o
diff --git a/builtin.h b/builtin.h
index 8901a34d6bf..475c79b6a5a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -224,6 +224,7 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix);
 int cmd_status(int argc, const char **argv, const char *prefix);
 int cmd_stash(int argc, const char **argv, const char *prefix);
 int cmd_stripspace(int argc, const char **argv, const char *prefix);
+int cmd_submodule(int argc, const char **argv, const char *prefix);
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
 int cmd_switch(int argc, const char **argv, const char *prefix);
 int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
diff --git a/builtin/submodule.c b/builtin/submodule.c
new file mode 100644
index 00000000000..7e3499f3376
--- /dev/null
+++ b/builtin/submodule.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright (c) 2007-2022 Lars Hjemli & others
+ * Copyright(c) 2022 Ævar Arnfjörð Bjarmason
+ */
+#include "builtin.h"
+#include "parse-options.h"
+#include "run-command.h"
+#include "strvec.h"
+
+#define BUILTIN_SUBMODULE_USAGE \
+	"git submodule [--quiet] [--cached]"
+
+#define BUILTIN_SUBMODULE_ADD_USAGE \
+	N_("git submodule [--quiet] add [-b <branch>] [-f | --force] [--name <name>]\n" \
+	   "              [--reference <repository>] [--] <repository> [<path>]")
+
+#define BUILTIN_SUBMODULE_STATUS_USAGE \
+	N_("git submodule [--quiet] status [--cached] [--recursive] [--] [<path>...]")
+
+#define BUILTIN_SUBMODULE_INIT_USAGE \
+	N_("git submodule [--quiet] init [--] [<path>...]")
+
+#define BUILTIN_SUBMODULE_DEINIT_USAGE \
+	N_("git submodule [--quiet] deinit [-f | --force] (--all | [--] <path>...)")
+
+#define BUILTIN_SUBMODULE_UPDATE_USAGE \
+	N_("git submodule [--quiet] update [-v] [--init [--filter=<filter-spec>]]\n" \
+	   "              [--remote] [-N | --no-fetch] [-f | --force] [--checkout |--merge | --rebase]\n" \
+	   "              [--[no-]recommend-shallow] [--reference <repository>] [--recursive]\n" \
+	   "              [--[no-]single-branch] [--] [<path>...]")
+
+#define BUILTIN_SUBMODULE_SET_BRANCH_USAGE \
+	N_("git submodule [--quiet] set-branch (--default | --branch <branch>) [--] <path>")
+
+#define BUILTIN_SUBMODULE_SET_URL_USAGE \
+	N_("git submodule [--quiet] set-url [--] <path> <newurl>")
+
+#define BUILTIN_SUBMODULE_SUMMARY_USAGE \
+	N_("git submodule [--quiet] summary [--cached | --files] [--summary-limit <n>]\n"  \
+	   "              [commit] [--] [<path>...]")
+#define BUILTIN_SUBMODULE_FOREACH_USAGE \
+	N_("git submodule [--quiet] foreach [--recursive] <command>")
+
+#define BUILTIN_SUBMODULE_SYNC_USAGE \
+	N_("git submodule [--quiet] sync [--recursive] [--] [<path>...]")
+
+#define BUILTIN_SUBMODULE_ABSORBGITDIRS_USAGE \
+	N_("git submodule [--quiet] absorbgitdirs [--] [<path>...]")
+
I was surprised to see that these strings aren't deduped from the ones
in builtin/submodule--helper.c, and are, in fact, different. Is there a
reason for that?
+static const char * const git_submodule_usage[] = {
+	BUILTIN_SUBMODULE_USAGE,
+	BUILTIN_SUBMODULE_ADD_USAGE,
+	BUILTIN_SUBMODULE_STATUS_USAGE,
+	BUILTIN_SUBMODULE_INIT_USAGE,
+	BUILTIN_SUBMODULE_DEINIT_USAGE,
+	BUILTIN_SUBMODULE_UPDATE_USAGE,
+	BUILTIN_SUBMODULE_SET_BRANCH_USAGE,
+	BUILTIN_SUBMODULE_SET_URL_USAGE,
+	BUILTIN_SUBMODULE_SUMMARY_USAGE,
+	BUILTIN_SUBMODULE_FOREACH_USAGE,
+	BUILTIN_SUBMODULE_SYNC_USAGE,
+	BUILTIN_SUBMODULE_ABSORBGITDIRS_USAGE,
+	NULL,
+};
+
+static void setup_helper_args(int argc, const char **argv, const char *prefix,
+			      int quiet, int cached, struct strvec *args,
+			      const struct option *options)
+{
+	const char *cmd;
+	int do_quiet_cache = 1;
+	int do_prefix = 1;
+
+	strvec_push(args, "submodule--helper");
+
+	/* No command word defaults to "status" */
+	if (!argc) {
+		strvec_push(args, "status");
+		return;
+	}
We can't return without processing "--cached", e.g. removing the
explicit "status" subcommand like so fails.

  diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
  index d8f7d6ee29..5e61cef18e 100755
  --- a/t/t7400-submodule-basic.sh
  +++ b/t/t7400-submodule-basic.sh
  @@ -587,7 +587,7 @@ test_expect_success 'status should be "modified" after submodule commit' '
  '

  test_expect_success 'the --cached sha1 should be rev1' '
  -	git submodule status --cached >list &&
  +	git submodule --cached >list &&
    grep "^+$rev1" list
  '
+
+	/* Did we get --cached with a command? */
+	if (cached)
+		usage_msg_optf(_("'%s' option is only supported with explicit 'status'"),
+			       git_submodule_usage, options, "--cached");
+
+
+	/* Either a valid command, or submodule--helper will barf! */
+	cmd = argv[0];
submodule--helper does die, but the help message that it emits is
submodule--helper specific, instead of the "git submodule" usage string
from before.
+	strvec_push(args, cmd);
+	argv++;
+	argc--;
+
+	/*
+	  * This is stupid, but don't support "[--]" to
+	 * subcommand-less "git-submodule" for now.
+	 */
+	if (!strcmp(cmd, "--") || !strcmp(cmd, "--end-of-options"))
+		usage_msg_optf(_("need explicit sub-command name to delimit with '%s'"),
+			       git_submodule_usage, options, cmd);
If this is the only "stupid" behavior, maybe just call this out
specifically in the commit message.
+
+	/* Options that need to go before user-supplied options */
+	if (!strcmp(cmd, "absorbgitdirs"))
+		do_quiet_cache = 0;
+	else if (!strcmp(cmd, "update"))
+		;
+	else
+		do_prefix = 0;
+	if (do_quiet_cache) {
+		if (quiet)
+			strvec_push(args, "--quiet");
+		if (cached)
+			strvec_push(args, "--cached");
+
+		if (prefix && do_prefix)
+			strvec_pushl(args, "--prefix", prefix, NULL);
+	}
This looks like a good reason to get rid of "--prefix" from "git
submodule--helper update" like I mentioned upthread.

I didn't notice earlier that absorbgitdirs had the same problem, but
that's an even easier fixup:

  diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
  index d11e100301..1f7f142270 100644
  --- a/builtin/submodule--helper.c
  +++ b/builtin/submodule--helper.c
  @@ -2825,9 +2825,6 @@ static int absorb_git_dirs(int argc, const char **argv, const char *prefix)
    struct module_list list = MODULE_LIST_INIT;
    unsigned flags = ABSORB_GITDIR_RECURSE_SUBMODULES;
    struct option embed_gitdir_options[] = {
  -		OPT_STRING(0, "prefix", &prefix,
  -			   N_("path"),
  -			   N_("path into the working tree")),
      OPT_BIT(0, "--recursive", &flags, N_("recurse into submodules"),
        ABSORB_GITDIR_RECURSE_SUBMODULES),
      OPT_END()
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help