Re: [PATCH v2] pipe_command(): mark stdin descriptor as non-blocking
From: René Scharfe <hidden>
Date: 2022-08-10 05:48:11
Subsystem:
kernel build + files below scripts/ (unless maintained elsewhere), the rest · Maintainers:
Nathan Chancellor, Nicolas Schier, Linus Torvalds
Am 07.08.22 um 19:41 schrieb Jeff King:
On Sun, Aug 07, 2022 at 12:15:06PM +0200, René Scharfe wrote:quoted
quoted
This adds "error: pumping io failed: No space left on device" to output. Which kinda makes sense: With the pipe no longer blocking, there can be a moment when the buffer is full and writes have to be rejected. This condition should be reported with EAGAIN, though. Adding "if (len < 0 && errno == ENOSPC) continue;" after the xwrite() call in pump_io_round() lets the test pass. Perhaps the translation from Windows error code to POSIX is wrong here?So if we fix that with the patch below, t3701.57 still hangs, but this time it goes through wrapper.c::handle_nonblock() again and again. Replacing the "errno = EAGAIN" with a "return 0" to fake report a successful write of nothing instead lets the test pass. This seems to make sense -- looping in xwrite() won't help, as we need to read from the other fd first, to allow the process on the other end of the pipe to make some progress first, as otherwise the pipe buffer will stay full in this scenario. Shouldn't that be a problem on other systems as well?It doesn't happen on Linux; I suspect there's something funny either about partial writes, or about poll() on Windows. What's supposed to happen is: 1. pump_io() calls poll(), which tells us the descriptor is ready to write 2. we call xwrite(), and our actual write() call returns a partial write (i.e., reports "ret" bytes < "len" we passed in) 3. we return back to pump_io() do another round of poll(). If the other side consumed some bytes from the pipe, then we may get triggered to do another (possibly partial) write. If it didn't, and we'd get EAGAIN writing, then poll shouldn't trigger at all! So it's weird that you'd see EAGAIN in this instance. Either the underlying write() is refusing to do a partial write (and just returning an error with EAGAIN in the first place), or the poll emulation is wrong (telling us the descriptor is ready for writing when it isn't).
You're right, Windows' write needs two corrections. The helper below reports what happens when we feed a pipe with writes of different sizes. On Debian on WSL 2 (Windows Subsystem for Linux) it says: chunk size: 1 bytes 65536 total bytes written, then got EAGAIN chunk size: 1000 bytes 64000 total bytes written, then got EAGAIN chunk size: 1024 bytes 65536 total bytes written, then got EAGAIN chunk size: 100000 bytes 0 total bytes written, then got a partial write of 65536 bytes 65536 total bytes written, then got EAGAIN On Windows directly I get: chunk size: 1 bytes 8192 total bytes written, then got ENOSPC chunk size: 1000 bytes 8000 total bytes written, then got ENOSPC chunk size: 1024 bytes 8192 total bytes written, then got ENOSPC chunk size: 100000 bytes 0 total bytes written, then got ENOSPC https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html documents what we should get: Writes smaller than the buffer should be atomic, bigger writes bigger should be broken up, and the error code for a full buffer should be EAGAIN. I.e. the first example is right. So mingw_write() needs to translate ENOSPC to EAGAIN and break up huge writes instead of giving up outright.
Can you instrument pump_io_round() (or use some strace equivalent, if there is one) to see if we do a successful partial write first (which implies poll() is wrong in telling us we can write more for the second round), or if the very first write() is failing (which implies write() is wrong for returning EAGAIN when it could do a partial write).
The two corrections mentioned above together with the enable_nonblock() implementation for Windows (and the removal of "false") suffice to let t3701 pass when started directly, but it still hangs when running the whole test suite using prove. I don't have time to investigate right now, but I still don't understand how xwrite() can possibly work against a non-blocking pipe. It loops on EAGAIN, which is bad if the only way forward is to read from a different fd to allow the other process to drain the pipe buffer so that xwrite() can write again. I suspect pump_io_round() must not use xwrite() and should instead handle EAGAIN by skipping to the next fd. René --- Makefile | 1 + t/helper/test-nonblock.c | 51 ++++++++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 4 files changed, 54 insertions(+) create mode 100644 t/helper/test-nonblock.c
diff --git a/Makefile b/Makefile
index d9c00cc05d..0bc028ca00 100644
--- a/Makefile
+++ b/Makefile@@ -751,6 +751,7 @@ TEST_BUILTINS_OBJS += test-lazy-init-name-hash.o TEST_BUILTINS_OBJS += test-match-trees.o TEST_BUILTINS_OBJS += test-mergesort.o TEST_BUILTINS_OBJS += test-mktemp.o +TEST_BUILTINS_OBJS += test-nonblock.o TEST_BUILTINS_OBJS += test-oid-array.o TEST_BUILTINS_OBJS += test-oidmap.o TEST_BUILTINS_OBJS += test-oidtree.o
diff --git a/t/helper/test-nonblock.c b/t/helper/test-nonblock.c
new file mode 100644
index 0000000000..c9288ea6ac
--- /dev/null
+++ b/t/helper/test-nonblock.c@@ -0,0 +1,51 @@ +#include "test-tool.h" +#include "compat/nonblock.h" + +static void fill_pipe(size_t write_len) +{ + void *buf = xcalloc(1, write_len); + int fds[2]; + size_t total_written = 0; + int last = 0; + + if (pipe(fds)) + die("pipe failed"); + if (enable_nonblock(fds[1])) + die("enable_nonblock failed"); + + printf("chunk size: %"PRIuMAX" bytes\n", write_len); + for (;;) { + ssize_t written = write(fds[1], buf, write_len); + if (written != write_len) + printf("%"PRIuMAX" total bytes written, then got ", + (uintmax_t)total_written); + if (written < 0) { + switch (errno) { + case EAGAIN: printf("EAGAIN\n"); break; + case ENOSPC: printf("ENOSPC\n"); break; + default: printf("errno %d\n", errno); + } + break; + } else if (written != write_len) + printf("a partial write of %"PRIuMAX" bytes\n", + (uintmax_t)written); + if (last) + break; + if (written > 0) + total_written += written; + last = !written; + }; + + close(fds[1]); + close(fds[0]); + free(buf); +} + +int cmd__nonblock(int argc, const char **argv) +{ + fill_pipe(1); + fill_pipe(1000); + fill_pipe(1024); + fill_pipe(100000); + return 0; +}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c..562d7a9161 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c@@ -45,6 +45,7 @@ static struct test_cmd cmds[] = { { "match-trees", cmd__match_trees }, { "mergesort", cmd__mergesort }, { "mktemp", cmd__mktemp }, + { "nonblock", cmd__nonblock }, { "oid-array", cmd__oid_array }, { "oidmap", cmd__oidmap }, { "oidtree", cmd__oidtree },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb79927163..d9006a5298 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h@@ -36,6 +36,7 @@ int cmd__lazy_init_name_hash(int argc, const char **argv); int cmd__match_trees(int argc, const char **argv); int cmd__mergesort(int argc, const char **argv); int cmd__mktemp(int argc, const char **argv); +int cmd__nonblock(int argc, const char **argv); int cmd__oidmap(int argc, const char **argv); int cmd__oidtree(int argc, const char **argv); int cmd__online_cpus(int argc, const char **argv); --
2.37.1