Thread (5 messages), 2 authors, 2022-02-24

Re: Question about the sndbuf of the tap interface with vhost-net

From: Harold Huang <hidden>
Date: 2022-02-24 07:31:29

Hi, Jason,

Jason Wang [off-list ref] wrote on Thu, Feb 24, 2022 at 12:40:
On Thu, Feb 24, 2022 at 12:19 PM Harold Huang [off-list ref] wrote:
Thanks for Jason's comments.

Jason Wang [off-list ref] wrote on Thu, Feb 24, 2022 at 11:23:
Adding netdev.

On Wed, Feb 23, 2022 at 9:46 PM Harold Huang [off-list ref] wrote:
Sorry. The performance tested by iperf degraded from 4.5 Gbps to
750 Mbps per flow.

Harold Huang [off-list ref] wrote on Wed, Feb 23, 2022 at 21:13:
I see that in the DPDK virtio-user driver, TUNSETSNDBUF is initialized with
INT_MAX, see: https://github.com/DPDK/dpdk/blob/main/drivers/net/virtio/virtio_user/vhost_kernel_tap.c#L169
Note that Linux uses INT_MAX as the default sndbuf for tuntap.
That is OK because the tap driver uses it to support tx batching, see this
patch: https://github.com/torvalds/linux/commit/0a0be13b8fe2cac11da2063fb03f0f39359b3069

But in tun_xdp_one, NAPI is not supported, and I want to use NAPI in
tun_get_user to enable GRO.
NAPI is not enabled in this path, want to send a patch to do that?
Yes, I have a patch in this path to enable NAPI and it greatly
improves TCP stream performance, from 4.5 Gbps to 9.2 Gbps per flow. I
will send it later for comments.
Good to know that.

Have you compared it with non-NAPI mode?
Do you mean using netif_rx? If so, I have tested it and the performance
is about 5 Gbps. netif_rx calls process_backlog to process packets,
but it does not support GRO either.
Btw, NAPI mode was initially used for kernel networking stack hardening,
but it would be interesting to see whether it helps performance.
As a result, I changed the sndbuf to a
value such as 212992, as in /proc/sys/net/core/wmem_default.
Can you describe your setup in detail? Where did you run the iperf
server and client and where did you change the wmem_default?
I use dpdk-testpmd to test the vhost-net performance, such as:
dpdk-testpmd -l 0-9  -n 4
--vdev=virtio_user0,path=/dev/vhost-net,queue_size=1024,mac=00:00:0a:00:00:02
-a 0000:06:00.1 -- -i  --txd=1024 --rxd=1024

And I have changed the sndbuf in
https://github.com/DPDK/dpdk/blob/main/drivers/net/virtio/virtio_user/vhost_kernel_tap.c#L169
to 212992, so it is no longer INT_MAX. I also enabled NAPI in the tun
module. The iperf server ran on the tap interface on the kernel side,
where it received the TCP stream from dpdk-testpmd.
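
For readers following along, here is a minimal sketch of the kind of change
described above: opening a tap device and applying a finite sndbuf through
TUNSETSNDBUF instead of INT_MAX. It does not reproduce the DPDK code in
vhost_kernel_tap.c; the helper names pick_sndbuf and open_tap_with_sndbuf are
hypothetical, and the fallback to INT_MAX for out-of-range requests is an
assumption made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: map a requested send-buffer size to an int.
 * Non-positive or too-large requests fall back to INT_MAX, the
 * tuntap default mentioned in this thread. */
static int pick_sndbuf(long requested)
{
    if (requested <= 0 || requested > INT_MAX)
        return INT_MAX;
    return (int)requested;
}

/* Hypothetical helper: open a tap device and apply the chosen sndbuf.
 * Returns the fd on success, -1 on failure (this usually needs
 * CAP_NET_ADMIN and access to /dev/net/tun). */
static int open_tap_with_sndbuf(const char *name, long requested)
{
    struct ifreq ifr;
    int sndbuf = pick_sndbuf(requested);
    int fd;

    assert(name != NULL);

    fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    /* Attach to (or create) the tap device, then set its sndbuf. */
    if (ioctl(fd, TUNSETIFF, &ifr) < 0 ||
        ioctl(fd, TUNSETSNDBUF, &sndbuf) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller might use open_tap_with_sndbuf("tap0", 212992), where "tap0" is a
placeholder device name, to get the finite-sndbuf behaviour discussed here, or
pass 0 to keep the INT_MAX default.
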
You're doing TCP stream testing between two TAPs and using testpmd to forward traffic?
The test topology is as follows:
                ____________________________________
               |                                    |
iperf-server---tap<------->testpmd<------->ixgbe<----------->ixgbe (iperf client)
               |____________________________________|

The testpmd is used to forward traffic from another machine.
But the performance
is greatly degraded, from 4.5 Gbps to 750 Mbps. I am confused about
the perf result for the CPU core where the iperf server ran, which shows a
serious bottleneck: 59.86% of CPU in report_bug and 20.66% in
module_find_bug.
This looks odd, you may want to check your perf. I don't think
module_find_bug() runs in the datapath.
I use CentOS 8.2 with a native 4.18.0-193.el8.x86_64
kernel to test.
That kernel is rather old; I suggest testing a recent kernel version.
I will use a recent kernel to test it later.
Thanks
But the
performance tested by iperf is greatly degraded, from 4.5 Gbps to
750 Mbps per flow. I see the iperf server consuming 100% of a CPU core,
which should be the bottleneck of this test. The perf top result
for the iperf server's CPU core is as follows:

'''
Samples: 72  of event 'cycles', 4000 Hz, Event count (approx.):
22685278 lost: 0/0 drop: 0/0
Overhead  Shared O  Symbol
  59.86%  [kernel]  [k] report_bug
  20.66%  [kernel]  [k] module_find_bug
   6.51%  [kernel]  [k] common_interrupt
   2.82%  [kernel]  [k] __slab_free
   1.48%  [kernel]  [k] copy_user_enhanced_fast_string
   1.44%  [kernel]  [k] __skb_datagram_iter
   1.42%  [kernel]  [k] notifier_call_chain
   1.41%  [kernel]  [k] irq_work_run_list
   1.41%  [kernel]  [k] update_irq_load_avg
   1.41%  [kernel]  [k] task_tick_fair
   1.41%  [kernel]  [k] cmp_ex_search
   0.16%  [kernel]  [k] __ghes_peek_estatus.isra.12
   0.02%  [kernel]  [k] acpi_os_read_memory
   0.00%  [kernel]  [k] native_apic_mem_write
'''
I am not clear about the test result. Can we change the sndbuf size in
DPDK? Is there any way to enable vhost_net to use NAPI without changing the
tun kernel driver?
You can do this by not using INT_MAX as sndbuf.
As mentioned above, I changed the sndbuf value and ran into a serious
performance degradation.
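
The 212992 figure used for the sndbuf earlier in the thread is taken from
/proc/sys/net/core/wmem_default. As a rough sketch, that sysctl could be read
at run time rather than hard-coded; the helper name read_sysctl_long is
hypothetical, and returning -1 on any error is an assumption made for
illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a single integer from a sysctl file such as
 * /proc/sys/net/core/wmem_default.
 * Returns the parsed value, or -1 if the file cannot be opened or parsed. */
static long read_sysctl_long(const char *path)
{
    FILE *f;
    long value = -1;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Calling read_sysctl_long("/proc/sys/net/core/wmem_default") on the test
machine would show whether 212992 is indeed the value in effect there.
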
Thanks