Re: updated: kvm networking todo wiki
From: Rusty Russell <hidden>
Date: 2013-06-03 00:32:43
Also in: kvm, qemu-devel, virtualization
Anthony Liguori [off-list ref] writes:
"Michael S. Tsirkin" [off-list ref] writes:
On Thu, May 30, 2013 at 08:40:47AM -0500, Anthony Liguori wrote:
Stefan Hajnoczi [off-list ref] writes:
On Thu, May 30, 2013 at 7:23 AM, Rusty Russell [off-list ref] wrote:
Anthony Liguori [off-list ref] writes:
Rusty Russell [off-list ref] writes:
On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
FWIW, I think what's more interesting is using vhost-net as a networking backend with virtio-net in QEMU being what's guest facing. In theory, this gives you the best of both worlds: QEMU acts as a first line of defense against a malicious guest while still getting the performance advantages of vhost-net (zero-copy).

It would be an interesting idea if we didn't already have the vhost model where we don't need the userspace bounce.

The model is very interesting for QEMU because then we can use vhost as a backend for other types of network adapters (like vmxnet3 or even e1000). It also helps for things like fault tolerance where we need to be able to control packet flow within QEMU.

(CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).

Then I'm really confused as to what this would look like. A zero copy sendmsg? We should be able to implement that today. On the receive side, what can we do better than readv? If we need to return to userspace to tell the guest that we've got a new packet, we don't win on latency. We might reduce syscall overhead with a multi-dimensional readv to read multiple packets at once?

Sounds like recvmmsg(2).

Could we map this to mergeable rx buffers though?

Regards,
Anthony Liguori

Yes because we don't have to complete buffers in order.

What I meant though was for GRO, we don't know how large the received packet is going to be. Mergeable rx buffers let us allocate a pool of data for all incoming packets instead of allocating max packet size * max packets. recvmmsg expects an array of msghdrs and I presume each needs to be given a fixed size. So this seems incompatible with mergeable rx buffers.
Good point. You'd need to build 64k buffers to pass to recvmmsg, then reuse the parts it didn't touch on the next call. This limits us to about a 16th of what we could do with an interface which understood buffer merging, but I don't know how much that would matter in practice. We'd need some benchmarks....

Cheers,
Rusty.
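
For illustration only, here is a rough C sketch of the scheme described above: carve a pool into 64k slots, hand one slot per message to recvmmsg(2), then work out how much of each slot was left untouched and could be reused on the next call. The helper names (recv_batch, slots_in_pool, pages_in_pool, reusable_bytes) and the batch cap of 16 are hypothetical. The 4 KiB rx buffer size is an assumption, taken as one plausible reading of where the "16th" comes from (64k / 4k = 16); the thread does not state it. This is not code from QEMU or vhost.

```c
#define _GNU_SOURCE             /* recvmmsg() and struct mmsghdr are Linux/GNU extensions */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define SLOT_SIZE (64 * 1024)   /* worst-case packet size we must offer per message */
#define PAGE_SZ   4096          /* ASSUMED rx buffer size; not stated in the thread */

/* How many worst-case slots fit in a pool of pool_len bytes. */
static size_t slots_in_pool(size_t pool_len)
{
    return pool_len / SLOT_SIZE;
}

/* How many page-sized buffers the same pool holds (the merging-aware case). */
static size_t pages_in_pool(size_t pool_len)
{
    return pool_len / PAGE_SZ;
}

/*
 * Receive up to max_msgs datagrams from fd, one SLOT_SIZE slot per message.
 * lens[i] is set to the length of message i. Returns the number of messages
 * received, or -1 on error (errno as set by recvmmsg).
 */
static int recv_batch(int fd, unsigned char *pool, size_t pool_len,
                      unsigned int *lens, unsigned int max_msgs)
{
    enum { MAX_BATCH = 16 };    /* hypothetical cap, keeps the arrays on the stack */
    struct mmsghdr msgs[MAX_BATCH];
    struct iovec iov[MAX_BATCH];
    unsigned int n = max_msgs, i;
    int got;

    assert(pool != NULL && lens != NULL);
    if (n > MAX_BATCH)
        n = MAX_BATCH;
    if (n > slots_in_pool(pool_len))
        n = (unsigned int)slots_in_pool(pool_len);

    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < n; i++) {
        /* Each message gets a fixed-size buffer, as recvmmsg requires. */
        iov[i].iov_base = pool + (size_t)i * SLOT_SIZE;
        iov[i].iov_len = SLOT_SIZE;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* MSG_DONTWAIT: return whatever is queued instead of blocking. */
    got = recvmmsg(fd, msgs, n, MSG_DONTWAIT, NULL);
    for (i = 0; got > 0 && i < (unsigned int)got; i++)
        lens[i] = msgs[i].msg_len;
    return got;
}

/* Bytes of a slot untouched by a len-byte packet, counting whole pages only. */
static size_t reusable_bytes(unsigned int len)
{
    size_t used = ((size_t)len + PAGE_SZ - 1) / PAGE_SZ * PAGE_SZ;
    return SLOT_SIZE - used;
}
```

With a 1 MiB pool, slots_in_pool() gives 16 while pages_in_pool() gives 256, so under these assumptions a single call can offer only a 16th as many buffers as the pool holds. Whether that matters in practice is exactly the open benchmarking question.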