Re: [PATCH net-next v3 4/4] vsock/virtio: MSG_ZEROCOPY flag support
From: Stefano Garzarella <sgarzare@redhat.com>
Date: 2023-07-25 12:46:52
Also in:
kvm, lkml, virtualization
On Tue, Jul 25, 2023 at 08:39:17AM -0400, Michael S. Tsirkin wrote:
On Tue, Jul 25, 2023 at 02:28:02PM +0200, Stefano Garzarella wrote:quoted
On Tue, Jul 25, 2023 at 12:16:11PM +0300, Arseniy Krasnov wrote:quoted
On 25.07.2023 11:46, Arseniy Krasnov wrote:quoted
On 25.07.2023 11:43, Stefano Garzarella wrote:quoted
On Fri, Jul 21, 2023 at 08:09:03AM +0300, Arseniy Krasnov wrote:quoted
On 21.07.2023 00:42, Arseniy Krasnov wrote:quoted
This adds handling of MSG_ZEROCOPY flag on transmission path: if this flag is set and zerocopy transmission is possible (enabled in socket options and transport allows zerocopy), then non-linear skb will be created and filled with the pages of user's buffer. Pages of user's buffer are locked in memory by 'get_user_pages()'. Second thing that this patch does is replace type of skb owning: instead of calling 'skb_set_owner_sk_safe()' it calls 'skb_set_owner_w()'. Reason of this change is that '__zerocopy_sg_from_iter()' increments 'sk_wmem_alloc' of socket, so to decrease this field correctly proper skb destructor is needed: 'sock_wfree()'. This destructor is set by 'skb_set_owner_w()'. Signed-off-by: Arseniy Krasnov <redacted> --- Changelog: v5(big patchset) -> v1: * Refactorings of 'if' conditions. * Remove extra blank line. * Remove 'frag_off' field unneeded init. * Add function 'virtio_transport_fill_skb()' which fills both linear and non-linear skb with provided data. v1 -> v2: * Use original order of last four arguments in 'virtio_transport_alloc_skb()'. v2 -> v3: * Add new transport callback: 'msgzerocopy_check_iov'. It checks that provided 'iov_iter' with data could be sent in a zerocopy mode. If this callback is not set in transport - transport allows to send any 'iov_iter' in zerocopy mode. Otherwise - if callback returns 'true' then zerocopy is allowed. Reason of this callback is that in case of G2H transmission we insert whole skb to the tx virtio queue and such skb must fit to the size of the virtio queue to be sent in a single iteration (may be tx logic in 'virtio_transport.c' could be reworked as in vhost to support partial send of current skb). This callback will be enabled only for G2H path. For details pls see comment 'Check that tx queue...' below. include/net/af_vsock.h | 3 + net/vmw_vsock/virtio_transport.c | 39 ++++ net/vmw_vsock/virtio_transport_common.c | 257 ++++++++++++++++++------ 3 files changed, 241 insertions(+), 58 deletions(-)diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index 0e7504a42925..a6b346eeeb8e 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h@@ -177,6 +177,9 @@ struct vsock_transport {/* Read a single skb */ int (*read_skb)(struct vsock_sock *, skb_read_actor_t); + + /* Zero-copy. */ + bool (*msgzerocopy_check_iov)(const struct iov_iter *); }; /**** CORE ****/diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 7bbcc8093e51..23cb8ed638c4 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c@@ -442,6 +442,43 @@ static void virtio_vsock_rx_done(struct virtqueue *vq)queue_work(virtio_vsock_workqueue, &vsock->rx_work); } +static bool virtio_transport_msgzerocopy_check_iov(const struct iov_iter *iov) +{ + struct virtio_vsock *vsock; + bool res = false; + + rcu_read_lock(); + + vsock = rcu_dereference(the_virtio_vsock); + if (vsock) {Just noted, what about the following to reduce the indentation? if (!vsock) { goto out; }no {} pls
ooops, true, too much QEMU code today, but luckily checkpatch would have spotted it ;-)
quoted
... ... out: rcu_read_unlock(); return res;indentation is quite modest here. Not sure goto is worth it.
It's a pattern we follow a lot in this file, and I find the early return/goto more readable. Anyway, I don't have a strong opinion, it's fine the way it is now, actually we don't have too many indentations for now in this function. Thanks, Stefano