Re: [PATCH RERESEND 00/11] splice(file<>pipe) I/O on file as-if O_NONBLOCK
From: Jens Axboe <axboe@kernel.dk>
Date: 2023-12-14 19:07:01
Also in:
linux-fsdevel, linux-s390, linux-trace-kernel, lkml
On 12/14/23 11:44 AM, Ahelenia Ziemiańska wrote:
> First:
>   https://lore.kernel.org/lkml/cover.1697486714.git.nabijaczleweli@nabijaczleweli.xyz/t/#u
> Resend:
>   https://lore.kernel.org/lkml/1cover.1697486714.git.nabijaczleweli@nabijaczleweli.xyz/t/#u
> Resending again per
>   https://lore.kernel.org/lkml/20231214093859.01f6e2cd@kernel.org/t/#u
>
> Hi!
>
> As it stands, splice(file -> pipe):
> 1. locks the pipe,
> 2. does a read from the file,
> 3. unlocks the pipe.
>
> For reading from regular files and blockdevs this makes no difference.
> But if the file is a tty or a socket, for example, this means that until
> data appears, which it may never do, every process trying to read from
> or open the pipe enters an uninterruptible sleep,
> and will only exit it if the splicing process is killed.
>
> This trivially denies service to:
> * any hypothetical pipe-based log collection system
> * all nullmailer installations
> * me, personally, when I'm pasting stuff into qemu -serial chardev:pipe
>
> This follows:
> 1. https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
> 2. a security@ thread rooted in
>    <irrrblivicfc7o3lfq7yjm2lrxq35iyya4gyozlohw24gdzyg7@azmluufpdfvu>
> 3. https://nabijaczleweli.xyz/content/blogn_t/011-linux-splice-exclusion.html
>
> Patches were posted and then discarded on principle or functionality,
> all in all terminating in Linus posting
>> But it is possible that we need to just bite the bullet and say
>> "copy_splice_read() needs to use a non-blocking kiocb for the IO".
>
> This does that, effectively making splice(file -> pipe)
> request (and require) O_NONBLOCK on reads from the file:
> this doesn't affect splicing from regular files and blockdevs,
> since they're always non-blocking
> (and requesting the stronger "no kernel sleep" IOCB_NOWAIT is non-sensical),

Not sure how you got the idea that regular files or block devices are always non-blocking; this is certainly not true without IOCB_NOWAIT. Without IOCB_NOWAIT, you can certainly be waiting for previous IO to complete.

> but always returns -EINVAL for ttys.
> Sockets behave as expected from O_NONBLOCK reads:
> splice if there's data available else -EAGAIN.
>
> This should all pretty much behave as-expected.

Should it? Seems like there's a very high risk of breaking existing use
cases here.

Have you at all looked into the approach of enabling splice to/from
_without_ holding the pipe lock? That, to me, would seem like a much
saner approach, with the caveat that I have not looked into that at all
so there may indeed be reasons why this is not feasible.

-- 
Jens Axboe
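
For readers following the thread, here is a minimal userspace sketch of the socket semantics the quoted cover letter refers to ("splice if there's data available else -EAGAIN"). It is not taken from the patch series. It only exercises what, as far as I know, splice(2) already does when the caller sets O_NONBLOCK on the socket itself; the proposed series would apply that behaviour whether or not the flag is set. The function name splice_nonblock_demo, the AF_UNIX socketpair and the negative step codes are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch series.
 * Returns 0 if a splice from an O_NONBLOCK socket into a pipe fails with
 * EAGAIN while the socket is empty and transfers the data once some is
 * queued. Returns a negative step number otherwise.
 */
int splice_nonblock_demo(void)
{
	int sv[2], pfd[2];
	char buf[16];
	int ret, fl;
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (pipe(pfd) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -2;
	}

	/* Assumption: the caller opts in to non-blocking reads explicitly. */
	fl = fcntl(sv[0], F_GETFL);
	if (fl < 0 || fcntl(sv[0], F_SETFL, fl | O_NONBLOCK) < 0) {
		ret = -3;
		goto out;
	}

	/* Nothing queued on the socket yet: expect -1 with errno == EAGAIN. */
	n = splice(sv[0], NULL, pfd[1], NULL, sizeof(buf), 0);
	if (n != -1 || errno != EAGAIN) {
		ret = -4;
		goto out;
	}

	/* Queue five bytes on the peer end. */
	if (write(sv[1], "hello", 5) != 5) {
		ret = -5;
		goto out;
	}

	/* Data is available now, so the splice should move it into the pipe. */
	n = splice(sv[0], NULL, pfd[1], NULL, sizeof(buf), 0);
	if (n != 5) {
		ret = -6;
		goto out;
	}

	/* Confirm the bytes really arrived in the pipe. */
	if (read(pfd[0], buf, sizeof(buf)) != 5 || memcmp(buf, "hello", 5) != 0) {
		ret = -7;
		goto out;
	}

	ret = 0;
out:
	close(sv[0]);
	close(sv[1]);
	close(pfd[0]);
	close(pfd[1]);
	return ret;
}
```

Calling splice_nonblock_demo() from a small main() and checking for a return value of 0 is enough to try it. Without the F_SETFL step, and without SPLICE_F_NONBLOCK, the first splice() would instead block with the pipe lock held, which is the situation the cover letter describes.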