Thread (51 messages) 51 messages, 5 authors, 2023-08-15

Re: [PATCH RFC net-next v5 03/14] af_vsock: support multi-transport datagrams

From: Bobby Eshleman <hidden>
Date: 2023-08-04 17:02:09
Also in: bpf, kvm, lkml, netdev

On Fri, Aug 04, 2023 at 04:11:58PM +0200, Stefano Garzarella wrote:
On Thu, Aug 03, 2023 at 06:58:24PM +0000, Bobby Eshleman wrote:
quoted
On Thu, Aug 03, 2023 at 02:42:26PM +0200, Stefano Garzarella wrote:
quoted
On Thu, Aug 03, 2023 at 12:53:22AM +0000, Bobby Eshleman wrote:
quoted
On Wed, Aug 02, 2023 at 10:24:44PM +0000, Bobby Eshleman wrote:
quoted
On Sun, Jul 23, 2023 at 12:53:15AM +0300, Arseniy Krasnov wrote:
quoted

On 19.07.2023 03:50, Bobby Eshleman wrote:
quoted
This patch adds support for multi-transport datagrams.

This includes:
- Per-packet lookup of transports when using sendto(sockaddr_vm)
- Selecting H2G or G2H transport using VMADDR_FLAG_TO_HOST and CID in
  sockaddr_vm
- rename VSOCK_TRANSPORT_F_DGRAM to VSOCK_TRANSPORT_F_DGRAM_FALLBACK
- connect() now assigns the transport for (similar to connectible
  sockets)

To preserve backwards compatibility with VMCI, some important changes
are made. The "transport_dgram" / VSOCK_TRANSPORT_F_DGRAM is changed to
be used for dgrams only if there is not yet a g2h or h2g transport that
has been registered that can transmit the packet. If there is a g2h/h2g
transport for that remote address, then that transport will be used and
not "transport_dgram". This essentially makes "transport_dgram" a
fallback transport for when h2g/g2h has not yet gone online, and so it
is renamed "transport_dgram_fallback". VMCI implements this transport.

The logic around "transport_dgram" needs to be retained to prevent
breaking VMCI:

1) VMCI datagrams existed prior to h2g/g2h and so operate under a
   different paradigm. When the vmci transport comes online, it registers
   itself with the DGRAM feature, but not H2G/G2H. Only later when the
   transport has more information about its environment does it register
   H2G or G2H.  In the case that a datagram socket is created after
   VSOCK_TRANSPORT_F_DGRAM registration but before G2H/H2G registration,
   the "transport_dgram" transport is the only registered transport and so
   needs to be used.

2) VMCI seems to require a special message be sent by the transport when a
   datagram socket calls bind(). Under the h2g/g2h model, the transport
   is selected using the remote_addr which is set by connect(). At
   bind time there is no remote_addr because often no connect() has been
   called yet: the transport is null. Therefore, with a null transport
   there doesn't seem to be any good way for a datagram socket to tell the
   VMCI transport that it has just had bind() called upon it.

With the new fallback logic, after H2G/G2H comes online the socket layer
will access the VMCI transport via transport_{h2g,g2h}. Prior to H2G/G2H
coming online, the socket layer will access the VMCI transport via
"transport_dgram_fallback".

Only transports with a special datagram fallback use-case such as VMCI
need to register VSOCK_TRANSPORT_F_DGRAM_FALLBACK.

Signed-off-by: Bobby Eshleman <redacted>
---
 drivers/vhost/vsock.c                   |  1 -
 include/linux/virtio_vsock.h            |  2 --
 include/net/af_vsock.h                  | 10 +++---
 net/vmw_vsock/af_vsock.c                | 64 ++++++++++++++++++++++++++-------
 net/vmw_vsock/hyperv_transport.c        |  6 ----
 net/vmw_vsock/virtio_transport.c        |  1 -
 net/vmw_vsock/virtio_transport_common.c |  7 ----
 net/vmw_vsock/vmci_transport.c          |  2 +-
 net/vmw_vsock/vsock_loopback.c          |  1 -
 9 files changed, 58 insertions(+), 36 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index ae8891598a48..d5d6a3c3f273 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -410,7 +410,6 @@ static struct virtio_transport vhost_transport = {
 		.cancel_pkt               = vhost_transport_cancel_pkt,

 		.dgram_enqueue            = virtio_transport_dgram_enqueue,
-		.dgram_bind               = virtio_transport_dgram_bind,
 		.dgram_allow              = virtio_transport_dgram_allow,

 		.stream_enqueue           = virtio_transport_stream_enqueue,
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 18cbe8d37fca..7632552bee58 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -211,8 +211,6 @@ void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val);
 u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk);
 bool virtio_transport_stream_is_active(struct vsock_sock *vsk);
 bool virtio_transport_stream_allow(u32 cid, u32 port);
-int virtio_transport_dgram_bind(struct vsock_sock *vsk,
-				struct sockaddr_vm *addr);
 bool virtio_transport_dgram_allow(u32 cid, u32 port);

 int virtio_transport_connect(struct vsock_sock *vsk);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 305d57502e89..f6a0ca9d7c3e 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -96,13 +96,13 @@ struct vsock_transport_send_notify_data {

 /* Transport features flags */
 /* Transport provides host->guest communication */
-#define VSOCK_TRANSPORT_F_H2G		0x00000001
+#define VSOCK_TRANSPORT_F_H2G			0x00000001
 /* Transport provides guest->host communication */
-#define VSOCK_TRANSPORT_F_G2H		0x00000002
-/* Transport provides DGRAM communication */
-#define VSOCK_TRANSPORT_F_DGRAM		0x00000004
+#define VSOCK_TRANSPORT_F_G2H			0x00000002
+/* Transport provides fallback for DGRAM communication */
+#define VSOCK_TRANSPORT_F_DGRAM_FALLBACK	0x00000004
 /* Transport provides local (loopback) communication */
-#define VSOCK_TRANSPORT_F_LOCAL		0x00000008
+#define VSOCK_TRANSPORT_F_LOCAL			0x00000008

 struct vsock_transport {
 	struct module *module;
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index ae5ac5531d96..26c97b33d55a 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -139,8 +139,8 @@ struct proto vsock_proto = {
 static const struct vsock_transport *transport_h2g;
 /* Transport used for guest->host communication */
 static const struct vsock_transport *transport_g2h;
-/* Transport used for DGRAM communication */
-static const struct vsock_transport *transport_dgram;
+/* Transport used as a fallback for DGRAM communication */
+static const struct vsock_transport *transport_dgram_fallback;
 /* Transport used for local communication */
 static const struct vsock_transport *transport_local;
 static DEFINE_MUTEX(vsock_register_mutex);
@@ -439,6 +439,18 @@ vsock_connectible_lookup_transport(unsigned int cid, __u8 flags)
 	return transport;
 }

+static const struct vsock_transport *
+vsock_dgram_lookup_transport(unsigned int cid, __u8 flags)
+{
+	const struct vsock_transport *transport;
+
+	transport = vsock_connectible_lookup_transport(cid, flags);
+	if (transport)
+		return transport;
+
+	return transport_dgram_fallback;
+}
+
 /* Assign a transport to a socket and call the .init transport callback.
  *
  * Note: for connection oriented socket this must be called when vsk->remote_addr
@@ -475,7 +487,8 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)

 	switch (sk->sk_type) {
 	case SOCK_DGRAM:
-		new_transport = transport_dgram;
+		new_transport = vsock_dgram_lookup_transport(remote_cid,
+							     remote_flags);
I'm a little bit confused about this:
1) Let's create SOCK_DGRAM socket using vsock_create()
2) for SOCK_DGRAM it calls 'vsock_assign_transport()' and we go here, remote_cid == -1
3) I guess 'vsock_dgram_lookup_transport()' calls logic from 0002 and returns h2g for such remote cid, which is not
   correct I think...

Please correct me if i'm wrong

Thanks, Arseniy
As I understand, for the VMCI case, if transport_h2g != NULL, then
transport_h2g == transport_dgram_fallback. In either case,
vsk->transport == transport_dgram_fallback.

For the virtio/vhost case, temporarily vsk->transport == transport_h2g,
but it is unused because vsk->transport->dgram_bind == NULL.

Until SS_CONNECTED is set by connect() and vsk->transport is set
correctly, the send path is barred from using the bad transport.

I guess the recvmsg() path is a little more sketchy, and probably only
works in my test cases because h2g/g2h in the vhost/virtio case have
identical dgram_addr_init() implementations.

I think a cleaner solution is maybe checking in vsock_create() if
dgram_bind is implemented. If it is not, then vsk->transport should be
reset to NULL and a comment added explaining why VMCI requires this.

Then the other calls can begin explicitly checking for vsk->transport ==
NULL.
Actually, on further reflection here, in order for the vsk->transport to
be called in time for ->dgram_addr_init(), it is going to be necessary
to call vsock_assign_transport() in vsock_dgram_bind() anyway.

I think this means that the vsock_assign_transport() call can be removed
from vsock_create() call entirely, and yet VMCI can still dispatch
messages upon bind() calls as needed.

This would then simplify the whole arrangement, if there aren't other
unseen issues.
This sounds like a good approach.

My only question is whether vsock_dgram_bind() is always called for each
dgram socket.
No, not yet.

Currently, receivers may use vsock_dgram_recvmsg() prior to any bind,
but this should probably change.

For UDP, if we initialize a socket and call recvmsg() with no prior
bind, then the socket will be auto-bound to 0.0.0.0. I guess vsock
should probably also auto-bind in this case.
I see.
quoted
For other cases, bind may not be called prior to calls to vsock_poll() /
vsock_getname() (even if it doesn't make sense to do so), but I think it
is okay as long as vsk->transport is not used.
Makes sense.
quoted
vsock_dgram_sendmsg() always auto-binds if needed.
Okay, but the transport for sending messages, doesn't depend on the
local address, right?
That is correct.

Best,
Bobby
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help