Re: updated: kvm networking todo wiki

From: Rusty Russell <hidden>
Date: 2013-06-03 00:32:43
Also in: kvm, qemu-devel, virtualization

Anthony Liguori [off-list ref] writes:

"Michael S. Tsirkin" [off-list ref] writes:

On Thu, May 30, 2013 at 08:40:47AM -0500, Anthony Liguori wrote:

Stefan Hajnoczi [off-list ref] writes:

On Thu, May 30, 2013 at 7:23 AM, Rusty Russell [off-list ref] wrote:

Anthony Liguori [off-list ref] writes:

Rusty Russell [off-list ref] writes:

On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:

FWIW, I think what's more interesting is using vhost-net as a networking
backend with virtio-net in QEMU being what's guest facing.

In theory, this gives you the best of both worlds: QEMU acts as a first
line of defense against a malicious guest while still getting the
performance advantages of vhost-net (zero-copy).

It would be an interesting idea if we didn't already have the vhost
model where we don't need the userspace bounce.

The model is very interesting for QEMU because then we can use vhost as
a backend for other types of network adapters (like vmxnet3 or even
e1000).

It also helps for things like fault tolerance where we need to be able
to control packet flow within QEMU.

(CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).

Then I'm really confused as to what this would look like.  A zero copy
sendmsg?  We should be able to implement that today.

On the receive side, what can we do better than readv?  If we need to
return to userspace to tell the guest that we've got a new packet, we
don't win on latency.  We might reduce syscall overhead with a
multi-dimensional readv to read multiple packets at once?

Sounds like recvmmsg(2).

Could we map this to mergeable rx buffers though?

Regards,

Anthony Liguori

Yes because we don't have to complete buffers in order.

What I meant though was for GRO, we don't know how large the received
packet is going to be.  Mergeable rx buffers let us allocate a pool of
data for all incoming packets instead of allocating max packet size *
max packets.

recvmmsg expects an array of msghdrs and I presume each needs to be
given a fixed size.  So this seems incompatible with mergeable rx
buffers.
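
For illustration only (not code from this thread), a minimal sketch of the
recvmmsg(2) calling convention being discussed: each entry in the mmsghdr
array gets its own iovec with a size fixed before the call.  BATCH,
SLOT_SIZE, rx_bufs and recv_batch() are hypothetical names, and the 64k
slot size is an assumption about the worst-case packet.

```c
#define _GNU_SOURCE             /* recvmmsg() is a Linux/glibc extension */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define BATCH     8             /* hypothetical: packets per syscall */
#define SLOT_SIZE 65536         /* assumed worst-case packet size */

/* One fixed-size slot per message; sizes cannot adapt to the packet. */
static char rx_bufs[BATCH][SLOT_SIZE];

/*
 * Receive up to BATCH datagrams with a single syscall.
 * Returns the number of datagrams received, or -1 on error
 * (including EAGAIN when nothing is queued).  lens[i] is set to
 * the length of datagram i for each datagram received.
 */
static int recv_batch(int fd, unsigned int lens[BATCH])
{
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];
    int i, n;

    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < BATCH; i++) {
        iov[i].iov_base = rx_bufs[i];
        iov[i].iov_len = SLOT_SIZE;     /* fixed up front */
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
    assert(n <= BATCH);

    for (i = 0; i < n; i++)
        lens[i] = msgs[i].msg_len;      /* bytes actually received */
    return n;
}
```

Every slot has to be sized for the largest packet that might arrive, which
is the mismatch with a shared pool of mergeable buffers.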

Good point.  You'd need to build 64k buffers to pass to recvmmsg, then
reuse the parts it didn't touch on the next call.  This limits us to
about a 16th of what we could do with an interface which understood
buffer merging, but I don't know how much that would matter in
practice.  We'd need some benchmarks....
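
One way to read the "16th" figure, sketched under assumptions that are not
stated in the thread: fixed 64k recvmmsg slots versus mergeable buffers of
one 4k page each, with the same pool size for both.  The names and sizes
below are hypothetical.

```c
#include <stddef.h>

#define FIXED_SLOT   (64 * 1024)  /* one recvmmsg slot per packet */
#define MRG_BUF_SIZE (4 * 1024)   /* assumed mergeable buffer: one page */

/* Packets a pool can hold when every packet reserves a full 64k slot. */
static size_t packets_fixed_slots(size_t pool_bytes)
{
    return pool_bytes / FIXED_SLOT;
}

/* Packets a pool can hold when each packet takes only the 4k buffers
 * it needs (rounded up), as with mergeable rx buffers. */
static size_t packets_merged(size_t pool_bytes, size_t pkt_bytes)
{
    size_t bufs_per_pkt = (pkt_bytes + MRG_BUF_SIZE - 1) / MRG_BUF_SIZE;

    return (pool_bytes / MRG_BUF_SIZE) / bufs_per_pkt;
}
```

With a 1 MiB pool and 1500-byte packets this gives 16 packets for fixed
slots against 256 for merged buffers, a factor of 16.  The gap closes as
packets approach 64k, which is part of why benchmarks would be needed.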

Cheers,
Rusty.
