Thread: 40 messages, 6 authors, 2022-10-05

RE: [PATCH v5 2/2] virtio-net: use mtu size as buffer length for big packets

From: Parav Pandit <hidden>
Date: 2022-09-07 19:06:55
Also in: virtualization

From: Michael S. Tsirkin <mst@redhat.com>
Sent: Wednesday, September 7, 2022 2:16 PM

On Wed, Sep 07, 2022 at 04:12:47PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin <mst@redhat.com>
Sent: Wednesday, September 7, 2022 10:40 AM

On Wed, Sep 07, 2022 at 02:33:02PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin <mst@redhat.com>
Sent: Wednesday, September 7, 2022 10:30 AM
[..]
Actually, how does this waste space? Is this because your device does not have INDIRECT?
VQ is 256 entries deep.
Driver posted total of 256 descriptors.
Each descriptor points to a page of 4K.
These descriptors are chained as 4K * 16.
So without indirect then? With indirect each descriptor can point to 16 entries.
With indirect, can it post 256 * 16 size buffers even though vq depth is 256 entries?
I recall that the total number of descriptors with direct/indirect descriptors is limited to vq depth.

Was there a recent clarification in the spec on this?

This would make INDIRECT completely pointless.  So I don't think we
ever had such a limitation.
The only thing that comes to mind is this:

	A driver MUST NOT create a descriptor chain longer than the Queue Size of
	the device.

but this limits individual chain length, not the total length of all chains.
Right.
I double-checked virtqueue_add_split(): for the indirect case it doesn't count
table entries towards the VQ's descriptor count.
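
Editorial sketch of the accounting described above, as a simplified model and not the kernel code: a direct chain consumes one ring descriptor per scatter-gather entry, while an indirect buffer consumes a single ring descriptor that points at a separate table. The function names and the fixed 16-entry chain are assumptions for illustration.

```python
def ring_descriptors_used(num_sg: int, indirect: bool) -> int:
    """Ring slots consumed by one buffer made of num_sg scatter-gather entries.

    Simplified model: with an indirect table, only the single descriptor
    that points at the table occupies a ring slot.
    """
    return 1 if indirect else num_sg


def max_buffers(queue_size: int, num_sg: int, indirect: bool) -> int:
    """How many such buffers fit in a ring of queue_size descriptors."""
    return queue_size // ring_descriptors_used(num_sg, indirect)


QUEUE_SIZE = 256      # VQ depth from the thread
SG_PER_BUFFER = 16    # 16 pages of 4K chained per buffer, from the thread

print(max_buffers(QUEUE_SIZE, SG_PER_BUFFER, indirect=False))
print(max_buffers(QUEUE_SIZE, SG_PER_BUFFER, indirect=True))
```

Under this model the per-chain limit from the spec still applies to each buffer, but the number of buffers posted is bounded by ring slots, not by total table entries.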
With indirect descriptors and without this patch, memory usage is even worse.
The driver will allocate 64K * 256 = 16MB of buffers per VQ, while the needed
(and used) buffer space is only 2.3 Mbytes.
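
Editorial sketch of the memory arithmetic above. The thread does not state the MTU; 9000 bytes is an assumption chosen here because 256 * 9000 reproduces the quoted 2.3 Mbytes. The variable names are illustrative only.

```python
PAGE_SIZE = 4096             # 4K pages, from the thread
PAGES_PER_BIG_BUFFER = 16    # 4K * 16 = 64K per buffer, from the thread
QUEUE_SIZE = 256             # VQ depth, from the thread
ASSUMED_MTU = 9000           # assumption: not stated in the thread

# With indirect descriptors, all 256 ring slots can each hold a 64K buffer.
posted_bytes = QUEUE_SIZE * PAGES_PER_BIG_BUFFER * PAGE_SIZE

# Payload actually needed if every buffer only has to hold one MTU-sized packet.
needed_bytes = QUEUE_SIZE * ASSUMED_MTU

ratio = posted_bytes / needed_bytes

print(f"posted: {posted_bytes / 2**20:.0f} MiB")
print(f"needed: {needed_bytes / 10**6:.1f} MB")
print(f"ratio:  {ratio:.1f}x")
```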

Yes. So, just so we understand the reason for the performance improvement:
is this because of memory usage? Or is this because the device does not have
INDIRECT?
Because of the shallow queue, only 16 entries deep.
Given the driver's turnaround time to repost buffers, the device sits idle without any RQ buffers.
With this improvement, the device has 85 buffers instead of 16 to receive packets.

Enabling indirect in the device can help, at the cost of 7x higher memory per VQ in the guest VM.
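
Editorial sketch of where the 16 and 85 figures can come from without indirect descriptors. It assumes one ring descriptor per 4K page, ignores any extra header descriptors, and assumes a 9000-byte MTU (not stated in the thread, but it reproduces the figure of 85). The helper name is hypothetical.

```python
import math

PAGE_SIZE = 4096    # 4K pages, from the thread
QUEUE_SIZE = 256    # VQ depth, from the thread


def buffers_per_queue(buffer_bytes: int) -> int:
    """Receive buffers that fit in the ring when each page takes one descriptor."""
    pages_per_buffer = math.ceil(buffer_bytes / PAGE_SIZE)
    return QUEUE_SIZE // pages_per_buffer


print(buffers_per_queue(64 * 1024))  # fixed 64K buffers: 16 pages each
print(buffers_per_queue(9000))       # assumed 9000-byte MTU: 3 pages each
```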