Re: [PATCH v2] 9p/trans_fd: perform read/write with TIF_SIGPENDING set
From: Christian Schoenebeck <linux_oss@crudebyte.com>
Date: 2022-09-01 15:23:36
On Samstag, 27. August 2022 08:11:48 CEST Tetsuo Handa wrote:
syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
started kernel_read() from p9_fd_read() from p9_read_work() and/or
kernel_write() from p9_fd_write() from p9_write_work() requests.
Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
need to interrupt kernel_{read,write}(). However, since p9_fd_open() does
not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
p9_mux_poll_stop() needs to interrupt kernel_{read,write}() when the file
descriptor refers to a pipe. In other words, pipe file descriptor needs
to be handled as if socket file descriptor. We somehow need to interrupt
kernel_{read,write}() on pipes.
If we can tolerate "possibility of breaking userspace program by setting
O_NONBLOCK flag on userspace-supplied file descriptors" and "possibility
of race window that userspace program clears O_NONBLOCK flag between after
automatically setting O_NONBLOCK flag and before calling
kernel_{read,write}()", we could automatically set O_NONBLOCK flag
immediately before calling kernel_{read,write}().
A different approach, which this patch is doing, is to surround
kernel_{read,write}() with set_thread_flag(TIF_SIGPENDING) and
recalc_sigpending(). This might be ugly and bit costly, but does not
touch userspace-supplied file descriptors.So the intention in this alternative approach is to allow user space apps still being able to perform blocking I/O, while at the same time making the kernel thread interruptible to fix this hung task issue, correct?
quoted hunk ↗ jump to hunk
Link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33 [1] Reported-by: syzbot <redacted> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <redacted> --- Although syzbot tested that this patch solves hung task problem, syzbot cannot verify that this patch will not break functionality of p9 users. Please test before applying this patch. net/9p/trans_fd.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index e758978b44be..e2f4e3245a80 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c@@ -256,11 +256,13 @@ static int p9_fd_read(struct p9_client *client, void*v, int len) if (!ts) return -EREMOTEIO; - if (!(ts->rd->f_flags & O_NONBLOCK)) - p9_debug(P9_DEBUG_ERROR, "blocking read ...\n"); - pos = ts->rd->f_pos; + /* Force non-blocking read() even without O_NONBLOCK. */ + set_thread_flag(TIF_SIGPENDING); ret = kernel_read(ts->rd, v, len, &pos); + spin_lock_irq(¤t->sighand->siglock); + recalc_sigpending(); + spin_unlock_irq(¤t->sighand->siglock);
Is the recalc_sigpending() block here actually needed? The TIF_SIGPENDING flag is already cleared by net/9p/client.c, no?
quoted hunk ↗ jump to hunk
if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) client->status = Disconnected; return ret;@@ -423,10 +425,12 @@ static int p9_fd_write(struct p9_client *client, void*v, int len) if (!ts) return -EREMOTEIO; - if (!(ts->wr->f_flags & O_NONBLOCK)) - p9_debug(P9_DEBUG_ERROR, "blocking write ...\n"); - + /* Force non-blocking write() even without O_NONBLOCK. */ + set_thread_flag(TIF_SIGPENDING); ret = kernel_write(ts->wr, v, len, &ts->wr->f_pos); + spin_lock_irq(¤t->sighand->siglock); + recalc_sigpending(); + spin_unlock_irq(¤t->sighand->siglock); if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) client->status = Disconnected; return ret;