Thread (80 messages) 80 messages, 6 authors, 2020-09-21

Re: [PATCH 01/11] maintenance: create basic maintenance runner

From: Jonathan Nieder <hidden>
Date: 2020-08-12 21:03:33

Derrick Stolee wrote:
Helped-by: Jonathan Nieder [off-list ref]
Signed-off-by: Derrick Stolee <redacted>
---
 .gitignore                        |  1 +
 Documentation/git-maintenance.txt | 57 +++++++++++++++++++++++++++++++
 builtin.h                         |  1 +
 builtin/gc.c                      | 57 +++++++++++++++++++++++++++++++
 git.c                             |  1 +
 t/t7900-maintenance.sh            | 19 +++++++++++
 t/test-lib-functions.sh           | 33 ++++++++++++++++++
 7 files changed, 169 insertions(+)
 create mode 100644 Documentation/git-maintenance.txt
 create mode 100755 t/t7900-maintenance.sh
Looks reasonable.

[...]
quoted hunk ↗ jump to hunk
--- /dev/null
+++ b/Documentation/git-maintenance.txt
@@ -0,0 +1,57 @@
+git-maintenance(1)
+==================
+
+NAME
+----
+git-maintenance - Run tasks to optimize Git repository data
+
+
+SYNOPSIS
+--------
+[verse]
+'git maintenance' run [<options>]
+
+
+DESCRIPTION
+-----------
+Run tasks to optimize Git repository data, speeding up other Git commands
+and reducing storage requirements for the repository.
++
+Git commands that add repository data, such as `git add` or `git fetch`,
+are optimized for a responsive user experience. These commands do not take
+time to optimize the Git data, since such optimizations scale with the full
+size of the repository while these user commands each perform a relatively
+small action.
++
+The `git maintenance` command provides flexibility for how to optimize the
+Git repository.
+
+SUBCOMMANDS
+-----------
+
+run::
+	Run one or more maintenance tasks.
This is still confusing --- shouldn't it say something like "Run the
`gc` maintenance task (see below)"?

[...]
+TASKS
+-----
+
+gc::
+	Cleanup unnecessary files and optimize the local repository. "GC"
nit: cleanup is the noun, "clean up" is the verb
+	stands for "garbage collection," but this task performs many
+	smaller tasks. This task can be rather expensive for large
nit: the word "rather" is not doing much work here, so we could leave
it out
+	repositories, as it repacks all Git objects into a single pack-file.
+	It can also be disruptive in some situations, as it deletes stale
+	data.
What stale data is this referring to?  Where can I read more about
what disruption it causes (or in other words, as a user, how would I
decide when *not* to run this command)?

[...]
+OPTIONS
+-------
+--auto::
+	When combined with the `run` subcommand, run maintenance tasks
+	only if certain thresholds are met. For example, the `gc` task
+	runs when the number of loose objects exceeds the number stored
+	in the `gc.auto` config setting, or when the number of pack-files
+	exceeds the `gc.autoPackLimit` config setting.
Hm, today I learned.  I had thought that --auto doesn't only affect
thresholds but also would affect how aggressive the gc is when
triggered, but the git-gc(1) manpage agrees with what's said above.

Is there a way we can share information between the two to avoid one
falling out of date?  For example, should git-gc.txt point to this
page for the authoritative description?

[...]
quoted hunk ↗ jump to hunk
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -699,3 +699,60 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 	return 0;
 }
+
+static const char * const builtin_maintenance_usage[] = {
+	N_("git maintenance run [<options>]"),
not about this patch: I wish we could automatically generate a usage
string in this simple kind of case, to decrease the burden on
translators.

[...]
+static int maintenance_task_gc(struct maintenance_opts *opts)
+{
+	struct child_process child = CHILD_PROCESS_INIT;
+
+	child.git_cmd = 1;
+	strvec_push(&child.args, "gc");
+
+	if (opts->auto_flag)
+		strvec_push(&child.args, "--auto");
+
+	close_object_store(the_repository->objects);
Interesting --- what does this function call do?
+	return run_command(&child);
+}
+
+static int maintenance_run(struct maintenance_opts *opts)
+{
+	return maintenance_task_gc(opts);
+}
+
+int cmd_maintenance(int argc, const char **argv, const char *prefix)
+{
+	static struct maintenance_opts opts;
+	static struct option builtin_maintenance_options[] = {
+		OPT_BOOL(0, "auto", &opts.auto_flag,
+			 N_("run tasks based on the state of the repository")),
+		OPT_END()
+	};
optional: these could be local instead of static
+
+	memset(&opts, 0, sizeof(opts));
not needed if it's static.  If it's not static, could use `opts = {0}`
where it's declared to do the same thing in a simpler way.
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_maintenance_usage,
+				   builtin_maintenance_options);
+
+	argc = parse_options(argc, argv, prefix,
+			     builtin_maintenance_options,
+			     builtin_maintenance_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+
+	if (argc == 1) {
+		if (!strcmp(argv[0], "run"))
+			return maintenance_run(&opts);
+	}
+
+	usage_with_options(builtin_maintenance_usage,
+			   builtin_maintenance_options);
optional: could reverse this test to get the uninteresting case out of
the way first:

	if (argc != 1)
		usage ...

	...

That would also allow making the usage string when argv[0] is not
"run" more specific.

[...]
quoted hunk ↗ jump to hunk
--- /dev/null
+++ b/t/t7900-maintenance.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='git maintenance builtin'
+
+. ./test-lib.sh
+
+test_expect_success 'help text' '
+	test_expect_code 129 git maintenance -h 2>err &&
+	test_i18ngrep "usage: git maintenance run" err
+'
+
+test_expect_success 'run [--auto]' '
+	GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run &&
+	GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto &&
+	test_subcommand git gc <run-no-auto.txt &&
+	test_subcommand git gc --auto <run-auto.txt
Nice.

[...]
quoted hunk ↗ jump to hunk
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1561,3 +1561,36 @@ test_path_is_hidden () {
 	case "$("$SYSTEMROOT"/system32/attrib "$1")" in *H*?:*) return 0;; esac
 	return 1
 }
+
+# Check that the given command was invoked as part of the
+# trace2-format trace on stdin.
+#
+#	test_subcommand [!] <command> <args>... < <trace>
+#
+# For example, to look for an invocation of "git upload-pack
+# /path/to/repo"
+#
+#	GIT_TRACE2_EVENT=event.log git fetch ... &&
+#	test_subcommand git upload-pack "$PATH" <event.log
+#
+# If the first parameter passed is !, this instead checks that
+# the given command was not called.
+#
+test_subcommand () {
+	local negate=
+	if test "$1" = "!"
+	then
+		negate=t
+		shift
+	fi
+
+	local expr=$(printf '"%s",' "$@")
+	expr="${expr%,}"
+
+	if test -n "$negate"
+	then
+		! grep "\[$expr\]"
+	else
+		grep "\[$expr\]"
+	fi
+}
With whatever subset of the changes described above makes sense,

Reviewed-by: Jonathan Nieder <redacted>

Thanks for your patient work.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help