Thread (33 messages) 33 messages, 6 authors, 2021-11-25

Re: [PATCH V5 1/4] virtio_ring: validate used buffer length

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2021-11-23 12:17:16
Also in: lkml

On Mon, Nov 22, 2021 at 02:50:03PM +0100, Halil Pasic wrote:
On Mon, 22 Nov 2021 14:25:26 +0800
Jason Wang [off-list ref] wrote:
quoted
On Mon, Nov 22, 2021 at 1:49 PM Halil Pasic [off-list ref] wrote:
quoted
On Mon, 22 Nov 2021 06:35:18 +0100
Halil Pasic [off-list ref] wrote:
 
quoted
quoted
I think it should be a common issue, looking at
vhost_vsock_handle_tx_kick(), it did:

len += sizeof(pkt->hdr);
vhost_add_used(vq, head, len);

which looks like a violation of the spec since it's TX.  
I'm not sure the lines above look like a violation of the spec. If you
examine vhost_vsock_alloc_pkt() I believe that you will agree that:
len == pkt->len == pkt->hdr.len
which makes sense since according to the spec both tx and rx messages
are hdr+payload. And I believe hdr.len is the size of the payload,
although that does not seem to be properly documented by the spec.  
Sorry for being unclear, what I meant is that we probably should use
zero here. TX doesn't use in buffer actually.

According to the spec, 0 should be the used length:

"and len the total of bytes written into the buffer."
Right, I was wrong. I somehow assumed this is the total length and not
just the number of bytes written.
quoted
quoted
quoted
On the other hand tx messages are stated to be device read-only (in the
spec) so if the device writes stuff, that is certainly wrong.
 
Yes.
quoted
quoted
If that is what happens.

Looking at virtqueue_get_buf_ctx_split() I'm not sure that is what
happens. My hypothesis is that we just a last descriptor is an 'in'
type descriptor (i.e. a device writable one). For tx that assumption
would be wrong.

I will have another look at this today and send a fix patch if my
suspicion is confirmed.
Yeah, I didn't remember the semantic of
vq->split.vring.used->ring[last_used].len
correctly, and in fact also how exactly the rings work. So your objection
is correct. 

Maybe updating some stuff would make it easier to not make this mistake.

For example the spec and also the linux header says:

/* le32 is used here for ids for padding reasons. */ 
struct virtq_used_elem { 
        /* Index of start of used descriptor chain. */ 
        le32 id; 
        /* Total length of the descriptor chain which was used (written to) */ 
        le32 len; 
};

I think that comment isn't as clear as it could be. I would prefer:
/* The number of bytes written into the device writable portion of the
buffer described by the descriptor chain. */

I believe "the descriptor chain which was used" includes both the
descriptors that map the device read only and the device write
only portions of the buffer described by the descriptor chain. And the
total length of that descriptor chain may be defined either as a number
of the descriptors that form the chain, or the length of the buffer.

One has to use the descriptor chain even if the whole buffer is device
read only. So "used" == "written to" does not make any sense to me.
The virtio spec actually says

Total length of the descriptor chain which was written to

without the "used" part.
Also something like
int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int bytes_written)
instead of
int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
would make it easier to read the code correctly.
I think we agree here. Patches?
quoted
quoted
If my suspicion is right something like:
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 00f64f2f8b72..efb57898920b 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -764,6 +764,7 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
        struct vring_virtqueue *vq = to_vvq(_vq);
        void *ret;
        unsigned int i;
+       bool has_in;
        u16 last_used;

        START_USE(vq);
@@ -787,6 +788,9 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
                        vq->split.vring.used->ring[last_used].id);
        *len = virtio32_to_cpu(_vq->vdev,
                        vq->split.vring.used->ring[last_used].len);
+       has_in = virtio16_to_cpu(_vq->vdev,
+                       vq->split.vring.used->ring[last_used].flags)
+                               & VRING_DESC_F_WRITE;  
Did you mean vring.desc actually? If yes, it's better not depend on
the descriptor ring which can be modified by the device. We've stored
the flags in desc_extra[].
quoted
        if (unlikely(i >= vq->split.vring.num)) {
                BAD_RING(vq, "id %u out of range\n", i);
@@ -796,7 +800,7 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
                BAD_RING(vq, "id %u is not a head!\n", i);
                return NULL;
        }
-       if (vq->buflen && unlikely(*len > vq->buflen[i])) {
+       if (has_in && q->buflen && unlikely(*len > vq->buflen[i])) {
                BAD_RING(vq, "used len %d is larger than in buflen %u\n",
                        *len, vq->buflen[i]);
                return NULL;
would fix the problem for split. I will try that out and let you know
later.  
I'm not sure I get this, in virtqueue_add_split, the buflen[i] only
contains the in buffer length.
Sorry my diff is indeed silly.
quoted
I think the fixes are:

1) fixing the vhost vsock
2) use suppress_used_validation=true to let vsock driver to validate
the in buffer length
3) probably a new feature so the driver can only enable the validation
when the feature is enabled.
Makes sense!

Regards,
Halil
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help