Re: [RFC PATCH] virtio: (Partially) enable suspend/resume support

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2010-11-05 10:15:36

On Wed, Oct 06, 2010 at 05:24:18PM +0530, Amit Shah wrote:

On (Tue) Oct 05 2010 [17:23:19], Michael S. Tsirkin wrote:

On Tue, Oct 05, 2010 at 07:15:31PM +0530, Amit Shah wrote:

+
+	spin_lock_irqsave(&vp_dev->lock, flags);
+	list_for_each_entry(info, &vp_dev->virtqueues, node) {
+		/* Select the queue we're interested in */
+		iowrite16(info->queue_index,
+			  vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
+
+		/* Update the last idx we sent data in */
+		iowrite16(virtqueue_get_avail_idx(info->vq),
+			  vp_dev->ioaddr + VIRTIO_PCI_AVAIL_IDX);

Interesting. Could we just reset the ring instead?
I think this would also solve the qemu problem you
outline, won't it?

The problem here is qemu doesn't "know" we went into suspend and came
out of it.  When going to suspend, qemu could've received a kick
notification and would've been just about to process some queue entries.
Now when we resume and reset the ring, qemu could crash anyway seeing
invalid index values.

Hmm, I don't completely understand this.  When there's a reset I expect
this to discard any previous kicks. No?

I'm talking of a situation like this:


	Guest					Host

   virtqueue_add_buf()
   virtqueue_kick()
					virtqueue_pop() (in progress)

    -->   suspend


Now there will be some buffers in the virtqueue but the host wouldn't
know on the next resume.  So we want to keep the ring state in the
current state so that the host continues consuming from where it left
off.

I still don't see it.  Why don't we reset on resume?
If there's a reset, the host must either discard or
flush out operations in progress.
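
To make the expectation concrete, here is a rough model of what I mean
by reset on resume. It is only a sketch: struct host_queue and both
helpers are hypothetical names made up for illustration, not code from
qemu or the kernel, and it takes the "discard" option rather than
"flush".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side state for one queue; not taken from qemu. */
struct host_queue {
	bool     kick_pending;   /* guest kicked, host has not finished popping */
	uint16_t last_avail_idx; /* next avail entry the host will consume */
};

/* Guest adds n buffers and kicks: the host now has work outstanding. */
static void guest_add_and_kick(uint16_t *avail_idx, struct host_queue *hq,
			       uint16_t n)
{
	assert(n > 0);
	*avail_idx = (uint16_t)(*avail_idx + n);
	hq->kick_pending = true;
}

/* Reset on resume: drop outstanding work, both sides go back to 0. */
static void reset_on_resume(uint16_t *avail_idx, struct host_queue *hq)
{
	hq->kick_pending = false;
	hq->last_avail_idx = 0;
	*avail_idx = 0;
}
```

In this model a suspend that lands between the kick and the pop leaves
nothing stale behind once reset_on_resume() has run.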

Now this wouldn't crash on resume for virtio-net, but for virtio-serial
(which uses chardevs) and also for virtio-block, I guess there are more
problems.

For example, for virtio-serial, if the machine is re-started with a
chardev connected, the virtqueue_num_heads() function gets called, which
results in the 'Guest moved used index' message.

		Amit

Does "Guest moved used index" mean a vring-related bug?
When the ring is reset, both host and guest should start from 0,
and the index must be 0 too.
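
As a sketch of why matching indices matter: the check below is modelled
on my understanding of the sanity test behind that message, namely that
the number of new heads (guest avail index minus the host's last seen
index, in 16-bit arithmetic) must not exceed the ring size. The struct
and function names are hypothetical and this is not the qemu code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the host's view of one ring; not qemu code. */
struct ring_model {
	uint16_t num;            /* ring size (number of descriptors) */
	uint16_t avail_idx;      /* index the guest publishes */
	uint16_t last_avail_idx; /* host's private "consumed up to" counter */
};

/* Both sides restart from 0, as expected after a reset. */
static void ring_model_reset(struct ring_model *r)
{
	r->avail_idx = 0;
	r->last_avail_idx = 0;
}

/*
 * Store the number of new heads in *heads and return true, or return
 * false if the count is implausible. The subtraction is done in 16 bits
 * on purpose so that index wrap-around still works.
 */
static bool ring_model_num_heads(const struct ring_model *r, uint16_t *heads)
{
	uint16_t n = (uint16_t)(r->avail_idx - r->last_avail_idx);

	assert(r->num != 0);
	if (n > r->num)
		return false; /* the kind of mismatch that trips the check */
	*heads = n;
	return true;
}
```

If only one side restarts from 0 (say avail_idx is 5 while the host
still remembers 300 on a 256-entry ring), the check fails. After
ring_model_reset() both indices agree and it passes.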

-- 
MST
