On Wed, Feb 08, 2017 at 06:41:40PM +0100, Paolo Bonzini wrote:
On 08/02/2017 04:20, Michael S. Tsirkin wrote:
* Scatter/gather support
We can use 1 bit to chain s/g entries in a request, same as virtio 1.0:
/* This marks a buffer as continuing via the next field. */
#define VRING_DESC_F_NEXT 1
Unlike virtio 1.0, all descriptors must have distinct ID values.
Also unlike virtio 1.0, use of this flag will be an optional feature
(e.g. VIRTIO_F_DESC_NEXT) so both devices and drivers can opt out of it.
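To make the chaining idea concrete, here is a minimal sketch in C. The descriptor layout (`struct sketch_desc`) and the helper `sg_chain_len` are hypothetical and not taken from the proposal; the sketch also assumes that chained entries occupy consecutive ring slots, which the text above does not state.

```c
#include <stddef.h>
#include <stdint.h>

/* This marks a buffer as continuing in the next entry. */
#define VRING_DESC_F_NEXT 1

/* Hypothetical descriptor layout, for illustration only. */
struct sketch_desc {
    uint64_t addr;   /* buffer address */
    uint32_t len;    /* buffer length */
    uint16_t id;     /* distinct per descriptor, per the proposal */
    uint16_t flags;  /* VRING_DESC_F_NEXT etc. */
};

/*
 * Count the entries of the s/g chain that starts at ring[head].
 * Assumption: chained entries sit in consecutive slots, wrapping at size.
 * Returns 0 if no entry without VRING_DESC_F_NEXT is found within
 * size entries (a malformed chain).
 */
static size_t sg_chain_len(const struct sketch_desc *ring, size_t size,
                           size_t head)
{
    size_t n = 0;
    size_t i = head;

    while (n < size) {
        n++;
        if (!(ring[i].flags & VRING_DESC_F_NEXT))
            return n;
        i = (i + 1 == size) ? 0 : i + 1;
    }
    return 0;
}
```

A side that opts out of the feature would never set the flag, so every chain would have length 1.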
I would still prefer that we had _either_ single-direct or
multiple-indirect descriptors, i.e. no VRING_DESC_F_NEXT. I can propose
my idea for this in a separate message.
All it costs us spec-wise is a single bit :)
The cost of indirect is an extra cache miss.
We couldn't decide what's better for everyone in 1.0 days and I doubt
we'll be able to now, but yes, benchmarking is needed to make
sure it's required. Very easy to remove or not to use/support in
drivers/devices though.
* Batching descriptors:
virtio 1.0 allows passing a batch of descriptors in both directions, by
incrementing the used/avail index by values > 1. We can support this by
chaining a list of descriptors through a bit in the flags field.
To allow use together with s/g, a different bit will be used.
#define VRING_DESC_F_BATCH_NEXT 0x0010
Batching works for both driver and device descriptors.
I'm still not sure how this would be useful.
So this is used at least by virtio-net mergeable buffers to combine
many buffers into a single packet.
Similarly, on transmit Linux sometimes supplies packets in batches
(XMIT_MORE flag). If the other side processes them, it seems nice to tell
it: there's more to come soon; if you see this, it is wise to poll now.
That's why I kind of felt it's better as a standard bit.
It cannot be mandatory to
set the bit, I think, because you don't know when the host/guest is
going to read descriptors. So both host and guest always have to look
ahead one element in any case.
Right, but the point is what to do if you find nothing there?
If you saw VRING_DESC_F_BATCH_NEXT it's a hint that
you should poll, there's more to come soon.
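A minimal sketch of that hint, in C. The enum and the function `on_empty_slot` are hypothetical names; the only thing taken from the proposal is the flag value. The idea is that when the consumer looks ahead and finds nothing, the flags of the last descriptor it processed decide whether to keep polling or go back to waiting for a notification.

```c
#include <stdint.h>

#define VRING_DESC_F_BATCH_NEXT 0x0010

/* Hypothetical: what the consumer does after finding the next slot empty. */
enum next_action {
    WAIT_FOR_NOTIFY,  /* nothing promised: re-enable notifications and sleep */
    POLL_BRIEFLY      /* producer hinted that more descriptors are coming */
};

/*
 * last_desc_flags: flags of the most recently consumed descriptor.
 * The bit is only a hint, so the consumer must still cope with the
 * promised descriptor arriving late or not at all.
 */
static enum next_action on_empty_slot(uint16_t last_desc_flags)
{
    if (last_desc_flags & VRING_DESC_F_BATCH_NEXT)
        return POLL_BRIEFLY;
    return WAIT_FOR_NOTIFY;
}
```

How long to poll before giving up is a policy choice that this sketch leaves open.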
* Non power-of-2 ring sizes
As the ring simply wraps around, there's no reason to
require ring size to be power of two.
It can be made a separate feature though.
Power of 2 ring sizes are required in order to ignore the high bits of
the indices. With non-power-of-2 sizes you are forced to keep the
indices less than the ring size.
Right. So
if (unlikely(++idx >= size))
        idx = 0;
OTOH a ring size that's twice as large as necessary
because of the power-of-two requirement wastes cache.
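For comparison, the two wrap strategies side by side, as a minimal C sketch. The function names are hypothetical. The branch version keeps the index below the ring size and works for any size; the mask version lets the index run freely and only works when the size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Any ring size: idx is kept in [0, size) at the cost of one branch. */
static inline uint16_t ring_next_any(uint16_t idx, uint16_t size)
{
    assert(size > 0 && idx < size);
    idx++;
    if (idx >= size)
        idx = 0;
    return idx;
}

/*
 * Power-of-2 ring size: the index is free-running (it wraps naturally
 * at 2^16) and the slot is found by ignoring the high bits.
 */
static inline uint16_t ring_slot_pow2(uint16_t free_running_idx, uint16_t size)
{
    assert(size > 0 && (size & (size - 1)) == 0);
    return free_running_idx & (uint16_t)(size - 1);
}
```

Which one is faster in practice depends on branch prediction and cache footprint, so this is something to measure rather than assume.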
Alternatively you can do this:
* Event index would be in the range 0 to 2 * Queue Size
(to detect wrap arounds) and wrap to 0 after that.
The assumption is that each side maintains an internal
descriptor counter 0 to 2 * Queue Size that wraps to 0.
In that case, interrupt triggers when counter reaches
the given value.
but it seems more complicated than just forcing power-of-2 and ignoring
the high bits.
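A minimal sketch of the counter scheme quoted above, in C. The struct and function names are hypothetical, and the sketch assumes the counter takes values in [0, 2 * Queue Size), that is, it wraps to 0 on reaching 2 * Queue Size; the quoted text leaves the exact bound open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-side descriptor counter for a ring of any size. */
struct wrap_counter {
    uint32_t value;       /* assumed range: [0, 2 * queue_size) */
    uint32_t queue_size;  /* need not be a power of two */
};

/*
 * Advance the counter by one descriptor. Returns true when the counter
 * reaches event_idx, i.e. when an interrupt should be triggered.
 */
static bool wrap_counter_advance(struct wrap_counter *c, uint32_t event_idx)
{
    assert(c->queue_size > 0);
    c->value++;
    if (c->value >= 2 * c->queue_size)
        c->value = 0;
    return c->value == event_idx;
}

/* Ring slot for the current counter value. */
static uint32_t wrap_counter_slot(const struct wrap_counter *c)
{
    return c->value >= c->queue_size ? c->value - c->queue_size
                                     : c->value;
}
```

Both the advance and the slot lookup need a compare here, which is the extra complexity relative to masking a free-running index.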
Thanks,
Paolo
Absolutely, power of 2 lets you save a branch.
At this stage I'm just recording all the ideas
and then as a next step we can micro-benchmark prototypes
and compare.
--
MST