Thread: 17 messages, 4 authors, 2021-03-31

Re: [External] Re: [RFC PATCH] virtio-vsock: add description for datagram type

From: Stefan Hajnoczi <stefanha@redhat.com>
Date: 2021-03-30 10:42:25

On Mon, Mar 29, 2021 at 04:22:28PM -0700, Jiang Wang . wrote:
On Mon, Mar 29, 2021 at 2:26 AM Stefan Hajnoczi [off-list ref] wrote:
quoted
On Fri, Mar 26, 2021 at 04:40:09PM -0700, Jiang Wang . wrote:
quoted
I thought about this and discussed it with my colleague Cong Wang.
One idea is to make the current asynchronous send_pkt flow synchronous;
then if the virtqueue is full, the function can return ENOMEM all the way back
to the caller and the caller can check the return value of sendmsg
and slow down when necessary.

In the spec, we can say something like: if the virtqueue is full, the caller
should be notified with an error.
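A minimal sketch of the idea above, written as a toy Python model rather than kernel code. ToyVirtqueue, send_with_backoff and the on_full callback are hypothetical names used only for illustration; the only behavior taken from the text is that a full virtqueue makes the send fail synchronously with ENOMEM so the caller can slow down.

```python
import errno
from collections import deque


class ToyVirtqueue:
    """Fixed-capacity ring standing in for a tx virtqueue (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.ring = deque()

    def add(self, pkt):
        # Synchronous path: report a full ring to the caller right away
        # instead of deferring the packet to a work queue.
        if len(self.ring) >= self.capacity:
            return -errno.ENOMEM
        self.ring.append(pkt)
        return 0

    def device_consume(self, n=1):
        # Stand-in for the device using up to n buffers.
        for _ in range(min(n, len(self.ring))):
            self.ring.popleft()


def send_with_backoff(vq, pkt, max_tries, on_full):
    """Try to send; call on_full(attempt) whenever the ring is full.

    Returns the number of failed attempts before success, or -ENOMEM
    if the ring was still full after max_tries attempts.
    """
    for attempt in range(max_tries):
        if vq.add(pkt) == 0:
            return attempt
        on_full(attempt)  # the caller slows down here (sleep, poll, ...)
    return -errno.ENOMEM
```

In a real sender, on_full would sleep or wait for the socket to become writable; in this model it can simply call vq.device_consume() to simulate the device making progress.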

In terms of implementation, that means we will remove the current
send_pkt_work for both stream and dgram sockets. Currently, the
code path uses RCU and a work queue, then grabs a mutex in the
work queue function. Since we cannot grab a mutex inside an RCU
critical section, we have to change RCU to a normal reference
counting mechanism. I think this is doable. The drawback is
that reference counting generally costs a few more
cycles than RCU, so there is a small price to pay. Another
option is to use Sleepable RCU and remove the work queue.

What do you guys think?
I think the tx code path is like this because of reliable delivery.
Maybe a separate datagram rx/tx code path would be simpler?
I thought about this too. dgram can have a separate rx/tx
path from the stream type. In this case, the the_virtio_vsock
will still be shared because the virtqueues have to be in one
structure. Then the_virtio_vsock will be protected by both RCU
and reference counting (or by sleepable RCU).

In vhost_vsock_dev_release, it will wait for both RCU and the other
mechanism to finish and then free the memory. I think this is
doable. Let me know if there is a better way to do it.
Btw, I think the dgram tx code path will be quite different from
stream, but the dgram rx path will be similar to the stream type.
quoted
Take the datagram tx virtqueue lock, try to add the packet into the
virtqueue, and return -ENOBUFS if the virtqueue is full. Then use the
datagram socket's sndbuf accounting to prevent queuing up too many
packets. When a datagram tx virtqueue buffer is used by the device,
select queued packets for transmission.
I am not sure about the last sentence. In the new design, there will
be no other queued packets for dgram (except those in the virtqueue).
When a dgram tx virtqueue buffer is freed by the device, the driver side
just needs to free some space. There is no need to trigger more
transmission.
This approach drops packets when the virtqueue is full. In order to get
close to UDP semantics I think the following is necessary:

1. Packets are placed directly into the tx virtqueue when possible.
2. Packets are queued up to sndbuf bytes if the tx virtqueue is full.
3. When a full tx virtqueue gets some free space again there is an
   algorithm that selects from among the queued sockets.

This is how unreliable delivery can be somewhat fair between competing
threads. I thought you were trying to achieve this (it's similar to what
UDP does)?
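
The three steps above can be sketched as a toy Python model (not kernel code). DgramSocket, DgramTx and their fields are hypothetical names, and round-robin is only one possible choice for the selection algorithm in step 3; the text does not name a specific one. The -ENOBUFS return for an over-limit socket is an assumption borrowed from the earlier suggestion in this thread.

```python
import errno
from collections import deque


class DgramSocket:
    """Per-socket send state (hypothetical)."""

    def __init__(self, name, sndbuf):
        self.name = name
        self.sndbuf = sndbuf       # byte limit for packets waiting on this socket
        self.queued_bytes = 0
        self.pending = deque()     # packets waiting for virtqueue space


class DgramTx:
    """Toy datagram tx path shared by several sockets (hypothetical)."""

    def __init__(self, vq_slots):
        self.vq_slots = vq_slots
        self.vq = deque()          # (socket name, payload) entries in the tx virtqueue
        self.waiting = deque()     # sockets with pending packets, round-robin order

    def send(self, sock, payload):
        # Step 1: place the packet directly into the virtqueue when possible.
        # The refill in device_used() keeps the ring full while any socket is
        # waiting, so free space here means nothing is queued ahead of us.
        if len(self.vq) < self.vq_slots:
            self.vq.append((sock.name, payload))
            return len(payload)
        # Step 2: virtqueue full, queue up to sndbuf bytes per socket.
        if sock.queued_bytes + len(payload) > sock.sndbuf:
            return -errno.ENOBUFS  # over the limit: the packet is dropped
        if not sock.pending:
            self.waiting.append(sock)
        sock.pending.append(payload)
        sock.queued_bytes += len(payload)
        return len(payload)

    def device_used(self, n=1):
        # The device consumes up to n buffers, freeing virtqueue space.
        for _ in range(min(n, len(self.vq))):
            self.vq.popleft()
        # Step 3: refill the free space, one packet per socket in turn.
        while self.waiting and len(self.vq) < self.vq_slots:
            sock = self.waiting.popleft()
            payload = sock.pending.popleft()
            sock.queued_bytes -= len(payload)
            self.vq.append((sock.name, payload))
            if sock.pending:
                self.waiting.append(sock)  # back of the line for its next packet
```

With two sockets competing for a two-slot virtqueue, a socket that has queued several packets gets only one slot per round, so a second socket with a single queued packet is not starved.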

Stefan