Thread (8 messages) 8 messages, 3 authors, 2019-11-25

Re: [PATCH] t3429: try to protect against a potential racy todo file problem

From: SZEDER Gábor <hidden>
Date: 2019-11-25 03:10:12
Subsystem: the rest · Maintainer: Linus Torvalds

Possibly related (same subject, not in this thread)

On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote:
To notice a changed todo file the sequencer stores the file's stat
data in its 'struct todo_list' instance, and compares it with the
file's current stat data after 'reword', 'squash' and 'exec'
instructions.  If the two stat data doesn't match, it re-reads the
todo file.

Sounds simple, but there are some subtleties going on here:

  - The 'struct todo_list' holds the stat data from the time when the
    todo file was last read.

  - This stat data in 'struct todo_list' is not updated when the
    sequencer itself writes the todo file.

  - Before executing each instruction during an interactive rebase,
    the sequencer always updates the todo file by removing the
    just-about-to-be-executed instruction.  This changes the file's
    size and inode [1].

Consequently, when the sequencer looks at the stat data after a
'reword', 'squash' or 'exec' instruction, it most likely finds that
they differ, even when the user didn't modify the todo list at all!
This is not an issue in practice, it just wastes a few cycles on
re-reading the todo list that matches what the sequencer already has
in memory anyway.

However, an unsuspecting Git developer might try to "fix" it by simply
updating the stat data each time the sequencer writes the todo list
for an interactive rebase.  On first sight it looks quite sensible and
straightforward, but we have to be very careful when doing that,
because potential racy problems lie that way.

It is possible to overwrite the todo list file without modifying
either its inode or size, and if such an overwrite were to happen in
the same second when the file was last read (our stat data has one
second granularity by default), then the actual stat data on the file
system would match the stat data that the sequencer has in memory.
Consequently, such a modification to the todo list file would go
unnoticed.
[1] The todo file is written using the lockfile API, i.e. first to the
    lockfile, which is then moved to place, so the new file can't
    possibly have the same inode as the file it replaces.  Note,
    however, that the file system might reuse inodes, so it is
    possible that the new todo file ends up with the same inode as is
    recorded in the 'struct todo_list' from the last time the file was
    read.
Unfortunately, we already do have an issue here when the sequencer can
overlook a modified todo list, but triggering it needs the lucky
"alignment" of inodes as well.  I can trigger it fairly reliably with
the test below, but forcing the inode match makes it kind of gross and
Linux-only.

  https://travis-ci.org/szeder/git/jobs/616460522#L1470


  ---  >8  ---
diff --git a/t/t9999-rebase-racy-todo-reread.sh b/t/t9999-rebase-racy-todo-reread.sh
new file mode 100755
index 0000000000..437ebd55e0
--- /dev/null
+++ b/t/t9999-rebase-racy-todo-reread.sh
@@ -0,0 +1,87 @@
+#!/bin/sh
+
+test_description='racy edit todo reread problem'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit first_ &&
+	test_commit second &&
+	test_commit third_ &&
+	test_commit fourth &&
+	test_commit fifth_ &&
+	test_commit sixth_ &&
+
+	write_script sequence-editor <<-\EOS &&
+		todo=.git/rebase-merge/git-rebase-todo &&
+		cat >"$todo" <<-EOF
+			reword $(git rev-parse second^0) second
+			reword $(git rev-parse third_^0) third_
+			reword $(git rev-parse fourth^0) fourth
+		EOF
+	EOS
+
+	write_script commit-editor <<-\EOS &&
+		read first_line <"$1" &&
+		echo "$first_line - edited" >"$1" &&
+
+		todo=.git/rebase-merge/git-rebase-todo &&
+
+		if test "$first_line" = second
+		then
+			stat --format=%i "$todo" >expected-ino
+		elif test "$first_line" = third_
+		then
+			ino=$(cat expected-ino) &&
+			file=$(find . -inum $ino) &&
+			if test -n "$file"
+			then
+				echo &&
+				echo "Trying to free inode $ino by moving \"$file\" out of the way" &&
+				cp -av "$file" "$file".tmp &&
+				rm -fv "$file"
+			fi &&
+
+			cat >"$todo".tmp <<-EOF &&
+			reword $(git rev-parse fifth_^0) fifth_
+			reword $(git rev-parse sixth_^0) sixth_
+			EOF
+			mv -v "$todo".tmp "$todo" &&
+
+			if test "$ino" -eq $(stat --format=%i "$todo")
+			then
+				echo "Yay! The todo list did get inode $ino, just what the sequencer is expecting!"
+			fi &&
+
+			if test -n "$file"
+			then
+				mv -v "$file".tmp "$file"
+			fi
+		fi
+	EOS
+
+	cat >expect <<-\EOF
+	sixth_ - edited
+	fifth_ - edited
+	third_ - edited
+	second - edited
+	first_
+	EOF
+'
+
+for trial in 0 1 2 3 4
+do
+	test_expect_success "demonstrate racy todo re-read problem #$trial" '
+		git reset --hard fourth &&
+		>expected-ino && # placeholder
+
+		GIT_SEQUENCE_EDITOR=./sequence-editor \
+		GIT_EDITOR=./commit-editor \
+		git rebase -i HEAD^^^ &&
+
+		git log --format=%s >actual &&
+		test_cmp expect actual
+	'
+done
+
+test_done
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help