Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

From: Christian Borntraeger <hidden>
Date: 2020-02-10 10:10:03
Also in: kvm, linux-next, lkml


On 10.02.20 10:40, Eugenio Perez Martin wrote:
> Hi Christian.
>
> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data? Your configuration (libvirt or qemu line), and host's dmesg output if any?

I do the following in the guest:
ping -c 200 -f somevalidip; reboot
Sometimes I need to do that multiple times. Sometimes I do not get a guest crash but instead a host dmesg message like

Guest moved used index from 0 to 292
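
For context, vhost logs this message when the guest's available index is further ahead of the index the host last processed than the ring has entries (despite the wording, the check concerns the available ring). Below is a minimal Python sketch of that kind of check, not the kernel code: the function names and the ring size of 256 are assumptions for illustration only.

```python
# Illustrative model of a ring-index sanity check.
# All names here are hypothetical; this is not the kernel implementation.
U16_MASK = 0xFFFF  # virtqueue indices are free-running 16-bit counters


def pending_descriptors(last_avail_idx: int, avail_idx: int) -> int:
    """Distance from the host's last seen index to the guest's index, modulo 2**16."""
    return (avail_idx - last_avail_idx) & U16_MASK


def avail_index_is_plausible(last_avail_idx: int, avail_idx: int, ring_size: int) -> bool:
    """A guest can never have more than ring_size buffers outstanding."""
    return pending_descriptors(last_avail_idx, avail_idx) <= ring_size


if __name__ == "__main__":
    # Ring size 256 is an assumption; 0 and 292 are taken from the dmesg line above.
    print(avail_index_is_plausible(0, 292, 256))
```

With an assumed ring of 256 entries, a jump from 0 to 292 fails the check, while a wrap such as 65530 to 4 passes because the distance is computed modulo 2**16.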

The XML is pretty simple:

    <interface type='direct'>
      <mac address='52:54:00:7c:2c:f3'/>
      <source dev='encbd00' mode='bridge'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
    </interface>


Reverting this patch seems to make both problems go away.

> Thanks!
>
> On Fri, Feb 7, 2020 at 9:13 AM Christian Borntraeger <borntraeger@de.ibm.com> wrote:



    On 07.02.20 08:58, Michael S. Tsirkin wrote:
    > On Fri, Feb 07, 2020 at 08:47:14AM +0100, Christian Borntraeger wrote:
    >> Also adding Cornelia.
    >>
    >>
    >> On 06.02.20 23:17, Michael S. Tsirkin wrote:
    >>> On Thu, Feb 06, 2020 at 04:12:21PM +0100, Christian Borntraeger wrote:
    >>>>
    >>>>
    >>>> On 06.02.20 15:22, eperezma@redhat.com wrote:
    >>>>> Hi Christian.
    >>>>>
    >>>>> Could you try this patch on top of ("38ced0208491 vhost: use batched version by default")?
    >>>>>
    >>>>> It will not solve your first random crash but it should help with the loss of network connectivity.
    >>>>>
    >>>>> Please let me know how it goes.
    >>>>
    >>>>
    >>>> 38ced0208491 + this seems to be ok.
    >>>>
    >>>> Not sure if you can make out anything of this (and the previous git bisect log)
    >>>
    >>> Yes it does - that this is just bad split-up of patches, and there's
    >>> still a real bug that caused worse crashes :)
    >>>
    >>> So I just pushed batch-v4.
    >>> I expect that will fail, and bisect to give us
    >>>     vhost: batching fetches
    >>> Can you try that please?
    >>>
    >>
    >> yes.
    >>
    >> eccb852f1fe6bede630e2e4f1a121a81e34354ab is the first bad commit
    >> commit eccb852f1fe6bede630e2e4f1a121a81e34354ab
    >> Author: Michael S. Tsirkin <mst@redhat.com>
    >> Date:   Mon Oct 7 06:11:18 2019 -0400
    >>
    >>     vhost: batching fetches
    >>     
    >>     With this patch applied, new and old code perform identically.
    >>     
    >>     Lots of extra optimizations are now possible, e.g.
    >>     we can fetch multiple heads with copy_from/to_user now.
    >>     We can get rid of maintaining the log array.  Etc etc.
    >>     
    >>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    >>
    >>  drivers/vhost/test.c  |  2 +-
    >>  drivers/vhost/vhost.c | 39 ++++++++++++++++++++++++++++++++++-----
    >>  drivers/vhost/vhost.h |  4 +++-
    >>  3 files changed, 38 insertions(+), 7 deletions(-)
    >>
    >
    >
    > And the symptom is still the same - random crashes
    > after a bit of traffic, right?

    Random guest crashes after a reboot of the guests, as if vhost were still
    writing into now-stale buffers.