Re: [PATCH v2 3/7] subtree: persist cache between split runs
From: Johannes Schindelin <hidden>
Date: 2020-10-07 16:06:35
Hi Tom, On Tue, 6 Oct 2020, Tom Clarkson via GitGitGadget wrote:
quoted hunk ↗ jump to hunk
@@ -48,6 +49,7 @@ annotate= squash= message= prefix= +clearcache=
It might be more consistent to call it `clear_cache` (i.e. with an underscore), just like `ignore_joins`.
quoted hunk ↗ jump to hunk
debug () { if test -n "$debug"@@ -131,6 +133,9 @@ do --no-rejoin) rejoin= ;; + --clear-cache) + clearcache=1 + ;; --ignore-joins) ignore_joins=1 ;;@@ -206,9 +211,13 @@ debug "opts: {$*}" debug cache_setup () { - cachedir="$GIT_DIR/subtree-cache/$$" - rm -rf "$cachedir" || - die "Can't delete old cachedir: $cachedir" + cachedir="$GIT_DIR/subtree-cache/$prefix"
Excellent, the `prefix` should be "unique enough".
quoted hunk ↗ jump to hunk
+ if test -n "$clearcache" + then + debug "Clearing cache" + rm -rf "$cachedir" || + die "Can't delete old cachedir: $cachedir" + fi mkdir -p "$cachedir" || die "Can't create new cachedir: $cachedir" mkdir -p "$cachedir/notree" ||@@ -266,6 +275,16 @@ cache_set () { echo "$newrev" >"$cachedir/$oldrev" } +cache_set_if_unset () { + oldrev="$1" + newrev="$2"
`local`? ;-)
+ if test -e "$cachedir/$oldrev" + then + return + fi + echo "$newrev" >"$cachedir/$oldrev"
So that directory contains commit mappings, a file for each mapped revision. Thinking back to patch 2/11, I am now no longer that sure that it makes sense to fill it up with every commit in that commit range: performance suffers when directories contain too many files. For example, I had a case in the past where it took a minute just to enumerate a directory, and even looking whether a file existed in that directory was not exactly fun. In any case, I would write it slightly shorter: test -e "$cachedir/$oldrev" || echo "$newrev" >"$cachedir/$oldrev"
quoted hunk ↗ jump to hunk
+} + rev_exists () { if git rev-parse "$1" >/dev/null 2>&1 then@@ -375,13 +394,13 @@ find_existing_splits () { then # squash commits refer to a subtree debug " Squash: $sq from $sub" - cache_set "$sq" "$sub" + cache_set_if_unset "$sq" "$sub" fi if test -n "$main" -a -n "$sub" then debug " Prior: $main -> $sub" - cache_set $main $sub - cache_set $sub $sub + cache_set_if_unset $main $sub + cache_set_if_unset $sub $sub try_remove_previous "$main" try_remove_previous "$sub" fi@@ -688,6 +707,8 @@ process_split_commit () { if test -n "$newparents" then cache_set "$rev" "$rev" + else + cache_set "$rev" ""
Was this hunk intended to be snuck in here? I can understand the s/cache_set/cache_set_if_unset/ changes, of course, but not this hunk.
quoted hunk ↗ jump to hunk
fi return fi@@ -785,7 +806,7 @@ cmd_split () { # the 'onto' history is already just the subdir, so # any parent we find there can be used verbatim debug " cache: $rev" - cache_set "$rev" "$rev" + cache_set_if_unset "$rev" "$rev" done fi@@ -798,7 +819,7 @@ cmd_split () { git rev-list --topo-order --skip=1 $mainline | while read rev do - cache_set "$rev" "" + cache_set_if_unset "$rev" ""
Okay. A quite interesting question now would be: are there any callers of `cache_set` left? If so, why? Thanks, Dscho
done || exit $? fi -- gitgitgadget