Thread (44 messages) 44 messages, 6 authors, 2021-06-24

Re: [PATCH RESEND x3 v9 1/9] iov_iter: add copy_struct_from_iter()

From: Omar Sandoval <osandov@osandov.com>
Date: 2021-06-23 19:33:45
Also in: linux-btrfs, linux-fsdevel

On Wed, Jun 23, 2021 at 11:28:15AM -0700, Linus Torvalds wrote:
On Wed, Jun 23, 2021 at 10:49 AM Omar Sandoval [off-list ref] wrote:
quoted
Al, Linus, what do you think? Is there a path forward for this series as
is?
So the "read from user space in order to write" is a no-go for me. It
completely violates what a "read()" system call should do. It also
entirely violates what an iovec can and should do.

And honestly, if Al hates the "first iov entry" model, I'm not sure I
want to merge that version - I personally find it fine, but Al is
effectively the iov-iter maintainer.

I do worry a bit about the "first iov entry" simply because it might
work for "writev2()" when given virtual user space addresses - but I
think it's conceptually broken for things like direct-IO which might
do things by physical address, and what is a contiguous user space
virtual address is not necessarily a contiguous physical address.

Yes, the filesystem can - and does - hide that path by basically not
doing direct-IO on the first entry at all, and just treat is very
specially in the front end of the IO access, but that only reinforces
the whole "this is not at all like read/write".

Similar issues might crop up in other situations, ie splice etc, where
it's not at all obvious that the iov_iter boundaries would be
maintained as it moves through the system.

So while I personally find the "first iov entry" model fairly
reasonable, I think Dave is being disingenuous when he says that it
looks like a normal read/write. It very much does not. The above is
quite fundamental.
quoted
 I'd be happy to have this functionality merged in any form, but I do
think that this approach with preadv2/pwritev2 using iov_len is decent
relative to the alternatives.
As mentioned, I find it acceptable. I'm completely unimpressed with
Dave's argument, but ioctl's aren't perfect either, so weak or not,
that argument being bogus doesn't necessarily mean that the iovec
entry model is wrong.

That said, thinking about exactly the fact that I don't think a
translation from iovec to anything else can be truly valid, I find the
iter_is_iovec() case to be the only obviously valid one.

Which gets me back to: how can any of the non-iovec alternatives ever
be valid? You did mention having missed ITER_XARRAY, but my question
is more fundamental than that. How could a non-iter_is_iovec ever be
valid? There are no possible interfaces that can generate such a thing
sanely.
I only implemented the bvec and kvec cases for completeness, since
copy_struct_from_iter() would appear to be a generic helper. At least
for RWF_ENCODED, a bvec seems pretty bogus, but it doesn't seem too
far-flung to imagine an in-kernel user of RWF_ENCODED that uses a kvec.

One other option that we haven't considered is ditching the
copy_struct_from_user() semantics and going the simpler route of adding
some reserved space to the end of struct encoded_iov:

struct encoded_iov {
	__aligned_u64 len;
	__aligned_u64 unencoded_len;
	__aligned_u64 unencoded_offset;
	__u32 compression;
	__u32 encryption;
	__u8 reserved[32];
};

Then we can do an unconditional copy_from_user_full(sizeof(struct
encoded_iov)) and check the reserved space in the typical fashion.

(And in the unlikely case that we use up all of that space with
extensions, I suppose we could have an RWF_ENCODED2 with a matching
struct encoded_iov2.)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help