Re: [PATCH] vhost: fix segfault on bad descriptor address.
From: Yuanhan Liu <hidden>
Date: 2016-07-06 12:23:12
On Wed, Jul 06, 2016 at 02:19:12PM +0300, Ilya Maximets wrote:
On 01.07.2016 10:35, Yuanhan Liu wrote:quoted
Hi, Sorry for the long delay. On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:quoted
In current implementation guest application can reinitialize vrings by executing start after stop. In the same time host application can still poll virtqueue while device stopped in guest and it will crash with segmentation fault while vring reinitialization because of dereferencing of bad descriptor addresses.Yes, you are right that vring will be reinitialized after restart. But even though, I don't see the reason it will cause a vhost crash, since the reinitialization will reset all the vring memeory by 0: memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size); That means those bad descriptors will be skipped, safely, at vhost side by: if (unlikely(desc->len < dev->vhost_hlen)) return -1;quoted
OVS crash for example: <------------------------------------------------------------------------> [test-pmd inside guest VM] testpmd> port stop all Stopping ports... Checking link statuses... Port 0 Link Up - speed 10000 Mbps - full-duplex Done testpmd> port config all rxq 2 testpmd> port config all txq 2 testpmd> port start all Configuring Port 0 (socket 0) Port 0: 52:54:00:CB:44:C8 Checking link statuses... Port 0 Link Up - speed 10000 Mbps - full-duplex Done [OVS on host] Program received signal SIGSEGV, Segmentation fault. rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.hInteresting, so it bypasses the above check since desc->len is non-zero while desc->addr is zero. The size (2056) also looks weird. Do you mind to check this issue a bit deeper, say why desc->addr is zero, however, desc->len is not?OK. I checked this few more times.
Thanks!
Actually, I see, that desc->addr is
not zero. All desc memory looks like some rubbish:
<------------------------------------------------------------------------------>
(gdb)
#3 copy_desc_to_mbuf (mbuf_pool=0x7fe9da9f4480, desc_idx=65363,
m=0x7fe9db269400, vq=0x7fe9fff7bac0, dev=0x7fe9fff7cbc0)
desc = 0x2aabc00ff530
desc_addr = 0
mbuf_offset = 0
prev = 0x7fe9db269400
nr_desc = 1
desc_offset = 12
cur = 0x7fe9db269400
hdr = 0x0
desc_avail = 1012591375
mbuf_avail = 1526
cpy_len = 1526
(gdb) p *desc
$2 = {addr = 8507655620301055744, len = 1012591387, flags = 3845, next = 48516}
<------------------------------------------------------------------------------>
And 'desc_addr' equals zero because 'gpa_to_vva' just can't map this huge
address to host's.
Scenario was the same. SIGSEGV received right after 'port start all'.
Another thought:
Actually, there is a race window between 'memset' in guest and reading
of 'desc->len' and 'desc->addr' on host. So, it's possible to read non
zero 'len' and zero 'addr' right after that.That's also what I was thinking, that it should the only reason caused such issue.
But you're right, this case should be very rare.
Yes, it should be very rare. What troubles me is that seems you can reproduce this issue very easily, that I doubt it's caused by this rare race. The reason could be something else? --yliu