Thread: 50 messages, 7 authors, 2021-07-21

Re: [dpdk-dev] [PATCH v5 3/4] vhost: support async dequeue for split ring

From: Hu, Jiayu <hidden>
Date: 2021-07-16 13:45:15

-----Original Message-----
From: David Marchand <redacted>
Sent: Friday, July 16, 2021 4:15 PM
To: Hu, Jiayu <redacted>
Cc: Maxime Coquelin <redacted>; Ma, WenwuX
[off-list ref]; dev@dpdk.org; Xia, Chenbo
[off-list ref]; Jiang, Cheng1 [off-list ref]; Wang,
YuanX [off-list ref]
Subject: Re: [dpdk-dev] [PATCH v5 3/4] vhost: support async dequeue for
split ring

On Wed, Jul 14, 2021 at 8:50 AM Hu, Jiayu [off-list ref] wrote:
Are we ensuring packets are not reordered with this way of working?
There is a threshold that can be set by users. If it is set to 0, which
means all packet copies are assigned to the DMA, the packets sent from
the guest will not be reordered.
- I find the rte_vhost_async_channel_register() signature with a bitfield quite
ugly.
We are writing sw, this is not mapped to hw stuff... but ok this is a different
topic.
I have reworked the structure. Here is the link:
http://patches.dpdk.org/project/dpdk/patch/1626465089-17052-3-git-send-email-jiayu.hu@intel.com/

- I don't like this threshold, this is too low level and most users will only see
the shiny aspect "better performance" without understanding the
consequences.
By default, it leaves the door open to a _bad_ behavior, that is packet
reordering.
At a very minimum, strongly recommend to use 0 in the API.
That's a good point. But there are some reasons to expose this value to users:
- large packets will block small packets, such as TCP control packets.
- DMA efficiency. We usually see 20~30% drops because of offloading 64B copies to
the DMA engine.
- the threshold is related not only to the hardware, but also to the application. The value decides
which copies are assigned to which worker, the CPU or the DMA. As async vhost works
asynchronously, the threshold value decides how much work can be done in
parallel. It's not only about which DMA engine and platform we use, but also about what
computation the CPU has been assigned. Different users will need different values.
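
To make the role of the threshold concrete, here is a minimal sketch of a length-based split between the CPU and the DMA. All names (copy_worker, pick_copy_worker, count_dma_copies) are hypothetical and are not part of the vhost API; the ">=" comparison is an assumption chosen so that a threshold of 0 sends every copy to the DMA, as described earlier in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; these names are not the vhost API. */
enum copy_worker { COPY_BY_CPU, COPY_BY_DMA };

/*
 * Assumed rule: copies at least as long as the threshold are offloaded
 * to the DMA, shorter ones are done by the CPU. With threshold == 0 the
 * comparison is always true, so every copy goes to the DMA.
 */
static enum copy_worker
pick_copy_worker(uint32_t copy_len, uint32_t threshold)
{
	return copy_len >= threshold ? COPY_BY_DMA : COPY_BY_CPU;
}

/* Count how many copies of a burst would be offloaded to the DMA. */
static size_t
count_dma_copies(const uint32_t *lens, size_t n, uint32_t threshold)
{
	size_t i, dma = 0;

	for (i = 0; i < n; i++)
		if (pick_copy_worker(lens[i], threshold) == COPY_BY_DMA)
			dma++;
	return dma;
}
```

For a burst with copy lengths 64, 1500, 128 and 9000 bytes, a threshold of 256 would leave the two small copies on the CPU and offload the two large ones, so the two workers complete independently of each other. A threshold of 0 puts all four copies on the DMA, which is the setting described above as avoiding reordering.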

I totally understand the worry about reordering. But simple iperf tests in our lab show
positive results with the threshold set. We need more careful tests before modifying
it, IMHO.

Thanks,
Jiayu


--
David Marchand
  