[PATCH v2 0/7] subtree: Fix handling of complex history
From: Tom Clarkson via GitGitGadget <hidden>
Date: 2020-10-06 22:05:18
This series fixes several issues that could occur when running subtree split
on large repos with complex history.
1. A merge commit could bypass the known start point of the subtree, which
would cause the entire history to be processed recursively, leading to a
stack overflow / segfault after reading a few hundred commits. Older
commits are now explicitly recorded as irrelevant so that the recursive
process can terminate on any mainline commit rather than only on subtree
joins and initial commits.
2. It is possible for a repo to contain subtrees that lack the metadata
that is usually present in add/join commit messages (git-svn at least
can produce such a structure). The new use/ignore/map commands allow the
user to provide that information for any problematic commits.
3. A mainline commit that does not contain the subtree folder could be
erroneously identified as a subtree commit, which would add the entire
mainline history to the subtree. Commits are now used as-is only if
all of their parents are already identified as subtree commits. While the
new code can still be tripped up by unusual folder structures, the
completely unambiguous solution turned out to involve a significant
performance penalty, and the new ignore / use commands provide a
workaround for that scenario.
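
As an illustration of the rule in item 3, here is a minimal sketch of an
"all parents already known as subtree commits" check. It is not the code
from this series: the function name all_parents_known and the
space-separated list of known commit ids are hypothetical, chosen only to
keep the example self-contained.

```shell
#!/bin/sh
# Hypothetical sketch, not taken from git-subtree.sh.
# $1:   space-separated list of commit ids already identified as
#       subtree commits (assumed representation).
# rest: the parent ids of the commit under consideration.
# Returns 0 only if every parent appears in the known list.
# With no parents at all the loop body never runs and the function
# returns 0; that is a property of this sketch, not a statement
# about how the patches treat root commits.
all_parents_known () {
	known="$1"
	shift
	for parent in "$@"
	do
		case " $known " in
		*" $parent "*)
			# This parent is known; keep checking the rest.
			;;
		*)
			# At least one parent is not a known subtree commit.
			return 1
			;;
		esac
	done
	return 0
}
```

Under these assumptions, a caller would use the commit as-is only when
all_parents_known succeeds, and otherwise fall back to the slower
processing path.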
Tom Clarkson (7):
subtree: handle multiple parents passed to cache_miss
subtree: exclude commits predating add from recursive processing
subtree: persist cache between split runs
subtree: add git subtree map command
subtree: add git subtree use and ignore commands
subtree: more robustly distinguish subtree and mainline commits
subtree: document new subtree commands
contrib/subtree/git-subtree.sh | 183 ++++++++++++++++++++++++++------
contrib/subtree/git-subtree.txt | 24 +++++
2 files changed, 175 insertions(+), 32 deletions(-)
base-commit: 47ae905ffb98cc4d4fd90083da6bc8dab55d9ecc
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-493%2Ftqc%2Ftqc%2Fsubtree-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-493/tqc/tqc/subtree-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/493
Range-diff vs v1:
1: 74fa670490 = 1: 9cff2a0cf6 subtree: handle multiple parents passed to cache_miss
2: 87af5a316a ! 2: 79b5f4a651 subtree: exclude commits predating add from recursive processing
@@ contrib/subtree/git-subtree.sh: find_existing_splits () {
+ debug "Looking for first split..."
+ dir="$1"
+ revs="$2"
-+ main=
-+ sub=
-+ local grep_format="^git-subtree-dir: $dir/*\$"
-+ git log --reverse --grep="$grep_format" \
++
++ git log --reverse --grep="^git-subtree-dir: $dir/*\$" \
+ --no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |
+ while read a b junk
+ do
3: c892ee9828 = 3: 8eec18388c subtree: persist cache between split runs
4: a67c256a59 = 4: 1490ce1114 subtree: add git subtree map command
5: a76a49651b = 5: 2d103292ce subtree: add git subtree use and ignore commands
6: 27a43ea2c4 = 6: a7aaedfed3 subtree: more robustly distinguish subtree and mainline commits
7: 19db9cfb68 = 7: fe2e4819b8 subtree: document new subtree commands
--
gitgitgadget