Re: [PATCH] net: tun: fix tun_xdp_one() for IFF_TUN mode
From: Jason Wang <jasowang@redhat.com>
Date: 2021-06-22 04:34:12
On 2021/6/21 at 6:52 PM, David Woodhouse wrote:
On Mon, 2021-06-21 at 15:00 +0800, Jason Wang wrote:
I think it's probably too late to fix? Since it should work before 043d222f93ab. The only way is to backport this fix to stable.

Yeah, I assumed the fix would be backported; if not then the "does the kernel have it" check is fairly trivial. I *can* avoid it for now by just using TUNSETSNDBUF to reduce the sndbuf and then we never take the XDP path at all.

My initial crappy hacks are slowly turning into something that I might actually want to commit to mainline (once I've fixed endianness and memory ordering issues): https://gitlab.com/openconnect/openconnect/-/compare/master...vhost

I have a couple of remaining problems using vhost-net directly from userspace though. Firstly, I don't think I can set IFF_VNET_HDR on the tun device after opening it. So my model of "open the tun device, then *see* if we can use vhost to accelerate it" doesn't work.
Yes, IFF_VNET_HDR is set during TUNSETIFF and can't be changed afterwards.
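To illustrate the point above, here is a minimal sketch of opening a tun device where the vnet-header decision has to be made up front, in the flags passed to the TUNSETIFF ioctl. The helper names tun_fill_ifreq() and tun_open() and the want_vnet_hdr parameter are hypothetical, made up for this example; they are not taken from openconnect or the kernel.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: build the ifreq for a layer-3 (IFF_TUN) device.
 * Whether to use a vnet header must be chosen here, before TUNSETIFF. */
void tun_fill_ifreq(struct ifreq *ifr, const char *name, int want_vnet_hdr)
{
	assert(strlen(name) < IFNAMSIZ);
	memset(ifr, 0, sizeof(*ifr));
	strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
	ifr->ifr_flags = IFF_TUN | IFF_NO_PI;
	if (want_vnet_hdr)
		ifr->ifr_flags |= IFF_VNET_HDR;
}

/* Hypothetical helper: open /dev/net/tun and attach to the named device.
 * Returns the fd, or -1 on failure (errno is left set by open/ioctl). */
int tun_open(const char *name, int want_vnet_hdr)
{
	struct ifreq ifr;
	int fd = open("/dev/net/tun", O_RDWR);

	if (fd < 0)
		return -1;
	tun_fill_ifreq(&ifr, name, want_vnet_hdr);
	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With this shape, a caller that wants to "try vhost later" would have to decide on want_vnet_hdr before it knows whether vhost is usable, which is exactly the ordering problem described above.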
I tried setting VHOST_NET_F_VIRTIO_NET_HDR in the vhost features instead, but that gives me a weird failure mode where it drops around half the incoming packets, and I haven't yet worked out why. Of course I don't *actually* want a vnet header at all but the vhost code really assumes that *someone* will add one; if I *don't* set VHOST_NET_F_VIRTIO_NET_HDR then it always *assumes* it can read ten bytes more from the tun socket than the 'peek' says, and barfs when it can't. (Or such was my initial half-thought-through diagnosis before I made it go away by setting IFF_VNET_HDR, at least).
Yes, vhost always assumes there's a vnet header.
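For reference, the "ten bytes" mentioned above match the size of the legacy struct virtio_net_hdr from <linux/virtio_net.h>. The sketch below checks that size at compile time and shows one way to prepend an all-zero header (no checksum offload, no GSO) to a packet. The function name prepend_vnet_hdr() is hypothetical, and whether a zeroed header is the right choice for a given vhost setup is an assumption, not something confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/virtio_net.h>

/* The legacy virtio-net header (without the mergeable-buffers field). */
_Static_assert(sizeof(struct virtio_net_hdr) == 10,
	       "legacy virtio-net header is expected to be 10 bytes");

/* Hypothetical helper: copy a zeroed vnet header followed by the packet
 * into out. Returns the total length, or 0 if out is too small. */
size_t prepend_vnet_hdr(unsigned char *out, size_t out_len,
			const unsigned char *pkt, size_t pkt_len)
{
	struct virtio_net_hdr hdr;

	if (out_len < sizeof(hdr) || pkt_len > out_len - sizeof(hdr))
		return 0;
	memset(&hdr, 0, sizeof(hdr));
	hdr.gso_type = VIRTIO_NET_HDR_GSO_NONE; /* no segmentation offload */
	memcpy(out, &hdr, sizeof(hdr));
	memcpy(out + sizeof(hdr), pkt, pkt_len);
	return sizeof(hdr) + pkt_len;
}
```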
Secondly, I need to pull numbers out of my posterior for the
VHOST_SET_MEM_TABLE call. This works for x86_64:
	vmem->nregions = 1;
	vmem->regions[0].guest_phys_addr = 4096;
	vmem->regions[0].memory_size = 0x7fffffffe000;
	vmem->regions[0].userspace_addr = 4096;
	if (ioctl(vpninfo->vhost_fd, VHOST_SET_MEM_TABLE, vmem) < 0) {
Is there a way to bypass that and just unconditionally set a 1:1
mapping of *all* userspace address space?

The memory table is one of the basic abstractions of vhost. Basically, you only need to map the userspace buffers. This is how the DPDK virtio-user PMD does it. Vhost will validate the addresses through access_ok() during VHOST_SET_MEM_TABLE. The range of the whole userspace address space seems architecture-specific, so I'm not sure it's worth the bother.

Thanks
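As a rough sketch of the "only map the userspace buffers" approach, the code below builds a one-region memory table that maps a single buffer 1:1, so that the "guest physical" addresses placed in the rings are simply the userspace pointers. The helper names vhost_mem_for_buffer() and vhost_set_mem() are hypothetical. The sketch assumes the vhost fd has already been opened and had VHOST_SET_OWNER called on it, and it does not address any alignment requirements the kernel may impose.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Hypothetical helper: allocate a table with one region covering
 * [buf, buf + len), mapped 1:1. The caller frees the result. */
struct vhost_memory *vhost_mem_for_buffer(void *buf, size_t len)
{
	struct vhost_memory *vmem;

	assert(buf != NULL && len > 0);
	vmem = calloc(1, sizeof(*vmem) + sizeof(vmem->regions[0]));
	if (!vmem)
		return NULL;
	vmem->nregions = 1;
	vmem->regions[0].guest_phys_addr = (uint64_t)(uintptr_t)buf;
	vmem->regions[0].memory_size = len;
	vmem->regions[0].userspace_addr = (uint64_t)(uintptr_t)buf;
	return vmem;
}

/* Hypothetical helper: hand the table to an already-set-up vhost fd.
 * Returns the ioctl result (0 on success, -1 with errno on failure). */
int vhost_set_mem(int vhost_fd, struct vhost_memory *vmem)
{
	return ioctl(vhost_fd, VHOST_SET_MEM_TABLE, vmem);
}
```

Compared with the single huge region quoted above, this avoids hard-coding an architecture-specific address-space limit, at the cost of having to allocate all packet buffers and rings from memory that the table covers.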
It's possible that one or the other of those problems will result in a new advertised "feature" which is so simple (like a 1:1 map) that we can call it a bugfix and backport it along with the tun fix I already posted, and the presence of *that* can indicate that the tun bug is fixed :)