Thread: 38 messages, 5 authors, 2024-06-06

Re: [PATCH vhost v13 05/12] virtio_ring: introduce virtqueue_dma_dev()

From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date: 2023-08-16 03:33:34
Also in: bpf, netdev

On Wed, 16 Aug 2023 10:33:34 +0800, Jason Wang [off-list ref] wrote:
On Wed, Aug 16, 2023 at 10:24 AM Xuan Zhuo [off-list ref] wrote:
On Wed, 16 Aug 2023 10:19:34 +0800, Jason Wang [off-list ref] wrote:
On Wed, Aug 16, 2023 at 10:16 AM Xuan Zhuo [off-list ref] wrote:
On Wed, 16 Aug 2023 09:13:48 +0800, Jason Wang [off-list ref] wrote:
On Tue, Aug 15, 2023 at 5:40 PM Xuan Zhuo [off-list ref] wrote:
On Tue, 15 Aug 2023 15:50:23 +0800, Jason Wang [off-list ref] wrote:
On Tue, Aug 15, 2023 at 2:32 PM Xuan Zhuo [off-list ref] wrote:

Hi, Jason

Could you skip this patch?
I'm fine with either merging or dropping this.
Could we review the other patches first?
I will be on vacation soon, and won't have time to do this until next week.
Have a happy vacation.
But I spot two possible "issues":

1) The DMA metadata is stored in the headroom of the page, which
breaks frag coalescing; we need to benchmark its impact.
Not every page, just the first page of the COMP pages.

So I think there is no impact.
Nope, see this:

        if (SKB_FRAG_PAGE_ORDER &&
            !static_branch_unlikely(&net_high_order_alloc_disable_key)) {
                /* Avoid direct reclaim but allow kswapd to wake */
                pfrag->page = alloc_pages((gfp & ~__GFP_DIRECT_RECLAIM) |
                                          __GFP_COMP | __GFP_NOWARN |
                                          __GFP_NORETRY,
                                          SKB_FRAG_PAGE_ORDER);
                if (likely(pfrag->page)) {
                        pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
                        return true;
                }
        }

The comp page might be disabled due to the SKB_FRAG_PAGE_ORDER and
net_high_order_alloc_disable_key.

YES.

But if the comp page is disabled, then we only get one page each time. The pages
are not contiguous, so there is no frag coalescing.

If you mean that two pages obtained from alloc_page() may happen to be
contiguous, then coalescing may be broken in that case. It's a possibility, but
I think the impact will be small.
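
For readers following the argument, here is a rough userspace model of the
condition being discussed. It is only a sketch: struct frag_model and
can_coalesce_model() are hypothetical names invented for illustration, not
kernel code. As far as I recall, the kernel's own check (skb_can_coalesce())
has the same shape: same page, and the new data starts exactly where the
previous frag ends.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one skb fragment: a page plus a byte range. */
struct frag_model {
	const void *page;	/* identity of the backing page */
	size_t offset;		/* start of the data within that page */
	size_t size;		/* length of the data in bytes */
};

/*
 * A new buffer can extend the previous fragment only if it lives on the
 * same page and begins exactly at the end of the previous fragment.
 */
bool can_coalesce_model(const struct frag_model *last,
			const void *page, size_t offset)
{
	return last->page == page &&
	       offset == last->offset + last->size;
}
```

In this model a buffer taken from a different page never coalesces, and a
buffer on the same page coalesces only when nothing (for example metadata
placed in front of it) leaves a gap after the previous frag.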
Let's have a simple benchmark and see?

That is ok.

I think you want to see the performance numbers under heavy traffic with the
comp page disabled.
Yes.

Hi,

Host:
	for ((i=0; i < 10; ++i)) do sockperf tp -i 192.168.122.100 -t 1000  -m 64000& done
Guest:
	03:23:12 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
	03:23:13 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
	03:23:13 AM      ens4  61848.00      1.00 3868036.73      0.58      0.00      0.00      0.00      0.00

	tcpdump:
		03:25:01.741563 IP 192.168.122.1.29693 > 192.168.122.100.11111: UDP, length 64000
		03:25:01.741580 IP 192.168.122.1.22239 > 192.168.122.100.11111: UDP, length 64000
		03:25:01.741623 IP 192.168.122.1.22396 > 192.168.122.100.11111: UDP, length 64000

The guest CPU utilization is low and every packet is 64000 bytes, but the host
vhost process is at 100%. So we cannot judge by the traffic or the CPU usage of
the guest.
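
As a sanity check on the sar line above, the average packet size can be worked
out from the rxkB/s and rxpck/s columns. This is a rough sketch under two
assumptions that are not stated in the thread: that sar reports kB as 1024
bytes, and that each packet carries 42 bytes of Ethernet, IPv4 (no options)
and UDP headers. The function name is made up for illustration.

```c
#include <stddef.h>

/* Assumed header overhead: Ethernet (14) + IPv4 without options (20) + UDP (8). */
enum { ASSUMED_HDR_BYTES = 14 + 20 + 8 };

/*
 * Average bytes per received packet, computed from sar's rxkB/s and
 * rxpck/s columns, assuming sar counts 1 kB as 1024 bytes.
 */
double avg_bytes_per_packet(double rx_kb_per_s, double rx_pck_per_s)
{
	return rx_kb_per_s * 1024.0 / rx_pck_per_s;
}
```

With the ens4 values (3868036.73 kB/s and 61848.00 pck/s) this gives roughly
64042 bytes per packet, which matches the 64000-byte payload seen in tcpdump
plus the assumed 42 bytes of headers.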

So I used a kernel without my patches (0635819decaf9d60e6cacfecfebfabe3cbdddafb).

I want to count the number of frag coalescing operations when the comp page is
disabled.

	$ sh -x test.sh
	+ sysctl -w net.core.high_order_alloc_disable=1
	net.core.high_order_alloc_disable = 1
	+ sysctl net.core.high_order_alloc_disable
	net.core.high_order_alloc_disable = 1
	+ sleep 5
	+ timeout 5 bpftrace -e 'kprobe: skb_coalesce_rx_frag{@[nsecs/1000/1000/1000]=count()}'
	Attaching 1 probe...



	+ sysctl -w net.core.high_order_alloc_disable=0
	net.core.high_order_alloc_disable = 0
	+ sysctl net.core.high_order_alloc_disable
	net.core.high_order_alloc_disable = 0
	+ sleep 5
	+ timeout 5 bpftrace -e 'kprobe: skb_coalesce_rx_frag{@[nsecs/1000/1000/1000]=count()}'
	Attaching 1 probe...


	@[356]: 167020
	@[361]: 673653
	@[359]: 900844
	@[360]: 912657
	@[358]: 915853
	@[357]: 932245


We can see that skb_coalesce_rx_frag is not called at all when the comp page is
disabled. When the comp page is enabled, there are many frag coalescing
operations.

So I think that my change will have no impact.

Thanks.



Thanks
Thanks.

Thanks
Thanks.

2) Premapped DMA addresses are not reused in the case of XDP_TX/XDP_REDIRECT.
That is because tx is not in premapped mode.
Yes, we can optimize this on top.

Thanks
Thanks.
I see Michael has merged this series, so I'm fine to let it go first.

Thanks
Thanks.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization