Re: [PATCH 2/2] t0610-reftable-basics: mitigate a flaky test on cygwin
From: Ramsay Jones <hidden>
Date: 2026-01-19 17:13:57
On 19/01/2026 6:50 am, Patrick Steinhardt wrote:
> On Fri, Jan 16, 2026 at 08:39:56PM +0000, Ramsay Jones wrote:
> > Test #29 ('ref transaction: corrupted tables cause failure') started
> > to fail intermittently for me (from v2.52.0-rc0) when running the
> > testsuite with '-j8'. (Also, having moved to a new laptop and windows
> > 11, rather than windows 10). If the test is run by hand, or without
> > any parallelism, then it passes without issue.
> >
> > When the test fails (e.g. 1 out of 32 parallel runs) the cause is due
> > to a permission error while corrupting a table file:
> >
> >   ./test-lib.sh: line 1010: .git/reftable/0x000000000001-0x000000000002-d89bb8ee.ref: Permission denied
>
> This rings a bell. I remember that we discussed a case at some point in
> time where a redirect converted to `test-tool truncate` fixed a flake on
> Cygwin.

Indeed, the mail thread starts at:

  https://lore.kernel.org/git/f22c95ad-43c8-41de-8315-e707224e830b@ramsayjones.plus.com/

> > This corruption is done in a shell loop, directly after a
> > 'test_commit', which uses an ': >"$f"' expression to truncate the
> > file. Adding a sleep of one second after the 'test_commit' and before
> > the shell loop fixes the test (it is not clear why). Replacing the
> > redirection shell expression with a 'test-tool truncate "$f" 0'
> > invocation also provides a fix, which could simply be another way to
> > change the timing sufficiently to win the race.
> >
> > During a debug session, I tried looking at the strace output for the
> > shell redirection:
> >
> >   $ rm /tmp/hello; echo hello >/tmp/hello; ls -l /tmp/hello
> >   -rw-r--r-- 1 ramsay None 6 Nov 10 17:25 /tmp/hello
> >   $
> >   $ strace -o zzz bash -c ': >/tmp/hello'
> >   $
> >
> > Similarly, for the test-tool solution:
> >
> >   $ strace -o xxx ./t/helper/test-tool truncate /tmp/hello 0
> >   $
> >
> > When comparing the output, the differences seemed to be what you
> > would expect and, if anything, the shell redirect probably would have
> > taken longer than the test-tool solution (many fcntl() calls to dup
> > the stdout to the <fd>). The call to the win32 api NtCreateFile() was
> > identical, apart from the first (FileHandle) parameter, of course.
>
> Too bad. I still wonder whether it is the extra process that we spawn
> that ends up fixing the issue.

Well, a 'sleep 1' before the shell loop also fixes the issue. I hate to mention the 'windows delays updating some file attributes until after the process has exited' conspiracy theory, but ... :) (yeah, I just don't think that is possible, except ...)
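
For anyone who wants to compare the two truncation forms outside the test suite, here is a minimal sketch. It only confirms that both forms leave the file at zero length and does not attempt to reproduce the race. Assumption: coreutils 'truncate -s 0' stands in for 'test-tool truncate "$f" 0', since the helper only exists in a built git tree, and the scratch path is made up for illustration.

```shell
#!/bin/sh
# Sketch only: truncate a scratch file both ways and compare the result.
# Assumption: coreutils 'truncate -s 0' is a stand-in for
# 'test-tool truncate "$f" 0' (the helper needs a built git tree).
tmp=$(mktemp -d) || exit 1
f="$tmp/hello"

# Form 1: shell redirection, as the test did before the patch.
echo hello >"$f"
: >"$f"
size_redirect=$(wc -c <"$f" | tr -d ' ')

# Form 2: a separate process truncates the file, as the patch does.
echo hello >"$f"
truncate -s 0 "$f"
size_helper=$(wc -c <"$f" | tr -d ' ')

echo "redirect=$size_redirect helper=$size_helper"
rm -rf "$tmp"
```

The end state is the same either way, which is consistent with the strace comparison: whatever differs between the two appears to be timing, not the file operation itself.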
> > In order to fix this flaky test on cygwin, despite not knowing why it
> > works, replace the shell redirection with the above 'test-tool
> > truncate' invocation.
> >
> > Helped-by: Patrick Steinhardt [off-list ref]
>
> Oh, so is this the exact case that we were talking about? If so, it
> might make sense to link to the mail thread so that folks can also read
> a bit into our discussion around this.

Indeed! I thought about referencing the email thread, but I decided that it didn't really offer any more supporting evidence than the commit message (in fact less - it doesn't mention the 'strace' scan).

I can add that (again [1]), if you think it's worth it, but I just re-read the email thread and I'm not convinced it offers much extra value. So, I would rather not re-roll, but I will if you think it worth it. Let me know.

[1] https://lore.kernel.org/git/f22c95ad-43c8-41de-8315-e707224e830b@ramsayjones.plus.com/

> > Signed-off-by: Ramsay Jones <redacted>
> > ---
> >  t/t0610-reftable-basics.sh | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh
> > index 6575528f21..e19e036898 100755
> > --- a/t/t0610-reftable-basics.sh
> > +++ b/t/t0610-reftable-basics.sh
> > @@ -207,7 +207,7 @@ test_expect_success 'ref transaction: corrupted tables cause failure' '
> >  		test_commit file1 &&
> >  		for f in .git/reftable/*.ref
> >  		do
> > -			: >"$f" || return 1
> > +			test-tool truncate "$f" 0 || return 1
> >  		done &&
> >  		test_must_fail git update-ref refs/heads/main HEAD
> >  	)
>
> In any case, if it seems to reliably fix the issue I'd say we just merge
> it. It's unfortunate that we haven't been able to figure out the root
> cause, but so be it.

Agreed!

Thanks!

ATB,
Ramsay Jones