Re: [PATCH] t3429: try to protect against a potential racy todo file problem

From: Phillip Wood <hidden>
Date: 2019-11-25 16:40:52
Subsystem: the rest · Maintainer: Linus Torvalds

Possibly related (same subject, not in this thread)

2019-11-25 · Re: [PATCH] t3429: try to protect against a potential racy todo file problem · Junio C Hamano <hidden>

On 25/11/2019 15:15, SZEDER Gábor wrote:

On Mon, Nov 25, 2019 at 02:43:07PM +0000, Phillip Wood wrote:

quoted

On 25/11/2019 13:18, SZEDER Gábor wrote:

quoted

On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote:

quoted

To notice a changed todo file the sequencer stores the file's stat
data in its 'struct todo_list' instance, and compares it with the
file's current stat data after 'reword', 'squash' and 'exec'
instructions.  If the two stat data doesn't match, it re-reads the
todo file.

Sounds simple, but there are some subtleties going on here:

   - The 'struct todo_list' holds the stat data from the time when the
     todo file was last read.

   - This stat data in 'struct todo_list' is not updated when the
     sequencer itself writes the todo file.

   - Before executing each instruction during an interactive rebase,
     the sequencer always updates the todo file by removing the
     just-about-to-be-executed instruction.  This changes the file's
     size and inode [1].

Consequently, when the sequencer looks at the stat data after a
'reword', 'squash' or 'exec' instruction, it most likely finds that
they differ, even when the user didn't modify the todo list at all!
This is not an issue in practice, it just wastes a few cycles on
re-reading the todo list that matches what the sequencer already has
in memory anyway.

It can be much more than just a few cycles, because the total number
of parsed instructions from all the todo file re-reads can go
quadratic with the number of rebased commits.

The simple test below runs 'git rebase -i -x' on 1000 commits, which
takes over 14seconds to run.  If it doesn't re-read the todo file at
all (I simply deleted the whole condition block checking the stat data
and re-reading) it runs for only ~2.5secs.

Just another angle to consider...

I know dscho was keen to avoid re-parsing the list all the time [1]
presumably because of the quadratic behavior. (He also assumed most people
were using ns stat times [2] but that appears not to be the case)

Nanosecond file timestamp comparisons are only enabled by the USE_NSEC
macro, which is only defined if the USE_NSEC Makefile knob is enabled,
but that is not enabled by default.

Then there is the related NO_NSEC Makefile knob:

   # Define NO_NSEC if your "struct stat" does not have "st_ctim.tv_nsec"
   # available.  This automatically turns USE_NSEC off.

As Dscho mentioned in [2], we do disable nanosecond file timestamp
comparisons in 'config.mak.uname' on a handful of platforms by setting
NO_NSEC.  This, however, does not mean that nanosec timestamps are
enabled on other platforms by default.

quoted

Could we
just compare the text of the todo list on disk to whats in todo->buf.buf
(with an appropriate offset)? That would avoid parsing the text and looking
up all the commits with get_oid()

Comparing the contents without parsing is still quadratic in the size
of the todo list, though I suppose with a much lower constant factor
than actually parsing it.

The patch below (assuming thunderbird doesn't mangle it) reduces the 
time to run your bulk commit test from 30s to 7s, if I delete the 
condition block which checks the stat data it takes 4.7s on my (somewhat 
ancient) laptop. So there is a cost to the string comparison approach 
but it's much less that the full todo list parsing.

Best Wishes

Phillip

--- >8 ---
diff --git a/sequencer.c b/sequencer.c
index 8952cfa89b..a3efdae0a5 100644
--- a/sequencer.c
+++ b/sequencer.c

@@ -3909,12 +3909,17 @@ static int pick_commits(struct repository *r,
                                                         arg, item->arg_len,
                                                         opts, res, 0);
                 } else if (check_todo && !res) {
-                       struct stat st;
-
-                       if (stat(get_todo_path(opts), &st)) {
-                               res = error_errno(_("could not stat '%s'"),
+                       int offset;
+                       struct strbuf buf = STRBUF_INIT;
+                       if (strbuf_read_file(&buf, get_todo_path(opts),

8096) < 0)
+                               res = error_errno(_("could not read '%s'"),
                                                   get_todo_path(opts));
-                       } else if (match_stat_data(&todo_list->stat, &st)) {
+
+                       offset = get_item_line_offset(todo_list,
+                                                     todo_list->current 
+ 1);
+                       if (buf.len != todo_list->buf.len  - offset ||
+                           memcmp(buf.buf, todo_list->buf.buf + offset, 
buf.len)) {
+                               fputs("re-reading todo list\n", stderr);
                                 /* Reread the todo file if it has 
changed. */
                                 todo_list_release(todo_list);
                                 if (read_populate_todo(r, todo_list, opts))

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help