Thread (11 messages) 11 messages, 2 authors, 2025-09-10

Re: [PATCH] contrib/subtree: fix split with squashed subtrees

From: Phillip Wood <hidden>
Date: 2025-09-02 13:22:20

On 01/09/2025 21:43, Colin Stagner wrote:
On 9/1/25 08:54, Phillip Wood wrote:
Colin Stagner [off-list ref] writes:
quoted
quoted
-    if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
+    if test -n "$(git log -1 --grep="git-subtree-dir:" "$rev^!")"
We could drop the "-1" as we're only considering a single commit.
Concur.
quoted
quoted
-        if test -z "$(git log -1 --grep="git-subtree-mainline:" 
$rev)" &&
-            test -z "$(git log -1 --grep="git-subtree-dir: 
$arg_prefix$" $rev)"
+        if test -z "$(git log -1 --grep="git-subtree-mainline:" 
"$rev^!")" &&
+            test -z "$(git log -1 --grep="git-subtree-dir: 
$arg_prefix$" "$rev^!")"
I'm less sure about this change. Is the second test checking
making sure we don't prune this commit if it has an ancestor
that is a subtree merge for the subtree we're interested in?
The outer loop in git-subtree.sh:983 appears to iterate from the root 
commit forwards… and not from the HEAD backwards.

     git rev-list --topo-order --reverse --parents $rev $unrevs
     #                         ^^^^^^^^^

Since the iteration is ancestor-first, I'm having difficulty seeing why 
`should_ignore_subtree_split_commit()` would want to do an ancestor 
traversal at all. It already sees the commits ancestor-first. But there 
could be a reason that I don't know.
Ah, I was only looking at this patch, not how it was called. That begs 
the question "what's the point of these checks if we've already visited 
all the ancestors anyway". I think the answer is that it is pruning the 
recursion that happens in check_parent() and checking the commits that 
come from that rev-list command is pointless. The regression test 
introduced with this function only looks at $extracount which comes from 
the recursion. I haven't looked too closely but it would be nice if we 
could move this check so it is only run when check_parents() is recursing.
Here is a more long-winded breakdown of these tests. From what I can 
determine:

     test -z "$(git log -1 --grep="git-subtree-mainline:" "$rev")

excludes squashed commits created from

     git subtree merge --prefix subM --squash srcBranch
 > > The --squash creates two commits:
1. A single-parent "Squashed 'subM/' content from", which
    squashes the changes from srcBranch. This commit's tree
    is like the one on srcBranch. It does not have the `subM/`
    prefix.
Yes, and the commit message comes from squash_msg() which adds the 
"git-subtree-dir:" and "git-subtree-split:" trailers but not 
"git-subtree-mainline:".
2. A merge commit which rewrites the tree in (1) to add
    the `subM/` leading directory, then merge it with the
    current branch. The merge commit doesn't have any
    `git-subtree:` trailers.
Indeed. Running

	git subtree split --squash --rejoin --prefix subM

will create a squash commit as above and a merge with a commit message 
created by rejoin_msg() and contains the "git-subtree-dir:", 
"git-subtree-split:" and "git-subtree-mainline:" trailers.
We must exclude (1) since the trees aren't actually compatible with 
HEAD. (They don't have the `subM` prefix). We must keep (2). The above 
`test -z` appears to do this.
So we'll exclude squashed merges but not squashed splits? I think you're 
right that we don't want "git log" to walk the history here.
I am *much* less certain about the second test:

    test -z "$(git log -1 \
               --grep="git-subtree-dir: $arg_prefix$" $rev)"

I think this was intended to keep the mainline portion from a previous 
`git subtree split --rejoin`. But if I remove this `test -z`, all the 
unit tests still pass—including mine. There may not be any test coverage 
for this line. I will probably omit this `test` from v2.
I'm not very familiar with git-subtree but I thought this was ensuring 
that we did not exclude the ancestors of a squash or split that involves 
the subtree that we're interested in. I wouldn't be surprised if the 
test coverage was lacking.
quoted
It would be very helpful if Zach could comment on what was intended here.
Yes, this would aid my understanding a lot.
and mine too.
quoted
If it turns out that all three tests only want to consider a single 
commit then it would be be more efficient to run a single git command 
and check the output with something like

     git show -s --format='%(trailers:key=git-subtree-dir,key=git- 
subtree-mainline' $rev | while read trailer
         do
             # check trailers here using case "$trailer"
         done
This is a cleaner approach, and I'll explore it for v2. Any objection to 
long options like `--no-patch` instead of `-s`? I find these are better 
for scripts since there's less hunting around in man pages.
Sure, I used '-s' out of habit but '--no-patch' would be clearer 
(especially as I can't see any link between the short and long option 
names in this case)

Thanks

Phillip
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help