Re: [PATCH v3 3/5] vhost_net: remove virtio_net_hdr validation, let tun/tap do it themselves
From: Jason Wang <jasowang@redhat.com>
Date: 2021-06-29 03:43:26
On 2021/6/29 at 7:29 AM, David Woodhouse wrote:
On Mon, 2021-06-28 at 12:23 +0100, David Woodhouse wrote:
To be clear: from the point of view of my *application* I don't care about any of this; my only motivation here is to clean up the kernel behaviour and make life easier for potential future users. I have found a setup that works in today's kernels (even though I have to disable XDP, and have to use a virtio header that I don't want), and will stick with that for now, if I actually commit it to my master branch at all:

https://gitlab.com/openconnect/openconnect/-/commit/0da4fe43b886403e6

I might yet abandon it because I haven't *yet* seen it go any faster than the code which just does read()/write() on the tun device from userspace. And without XDP or zerocopy it's not clear that it could ever give me any benefit that I couldn't achieve purely in userspace by having a separate thread to do tun device I/O. But we'll see...

I managed to do some proper testing, between EC2 c5 (Skylake) virtual instances.

The kernel on a c5.metal can transmit (AES128-SHA1) ESP at about 1.2Gb/s from iperf, as it seems to be doing it all from the iperf thread.

Before I started messing with OpenConnect, it could transmit 1.6Gb/s.

When I pull in the 'stitched' AES+SHA code from OpenSSL instead of doing the encryption and the HMAC in separate passes, I get to 2.1Gb/s.

Adding vhost support on top of that takes me to 2.46Gb/s, which is a decent enough win.
Interesting. I think the latency should be improved as well in this case.

Thanks
That's with OpenConnect taking 100% CPU, iperf3 taking 50% of another one, and the vhost kernel thread taking ~20%.
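
For reference, the relative gains implied by the throughput figures quoted above can be worked out with a short Python sketch. The numbers are taken from the message; treating them as cumulative steps (baseline, then stitched AES+SHA, then vhost "on top of that") is an assumption based on its wording, and the names `STEPS`, `KERNEL_ESP`, `relative_gain` and `summarize` are purely illustrative.

```python
# Throughput figures in Gb/s, as quoted in the message above.
# Assumption: each step builds on the previous one.
STEPS = [
    ("OpenConnect baseline", 1.6),
    ("stitched AES+SHA from OpenSSL", 2.1),
    ("vhost support on top", 2.46),
]
KERNEL_ESP = 1.2  # in-kernel ESP transmit rate quoted for the c5.metal


def relative_gain(before, after):
    """Return the fractional change going from `before` to `after`."""
    return (after - before) / before


def summarize(steps):
    """Return (step name, gain over the previous step) for each later step."""
    return [
        (name, relative_gain(prev, cur))
        for (_, prev), (name, cur) in zip(steps, steps[1:])
    ]


if __name__ == "__main__":
    for name, gain in summarize(STEPS):
        print(f"{name}: {gain:+.1%} over the previous step")
    overall = relative_gain(STEPS[0][1], STEPS[-1][1])
    print(f"overall vs. baseline: {overall:+.1%}")
    print(f"final rate vs. kernel ESP: {STEPS[-1][1] / KERNEL_ESP:.2f}x")
```

On these figures the stitched AES+SHA step accounts for roughly a 31% gain and vhost for roughly a further 17%, so most of the improvement comes from the crypto change rather than from vhost itself.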