Thread: 100 messages, 8 authors, 2021-11-22

Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices

From: Alex Williamson <hidden>
Date: 2021-11-02 20:16:11
Also in: kvm, linux-pci

On Tue, 2 Nov 2021 13:36:10 -0300
Jason Gunthorpe [off-list ref] wrote:
On Tue, Nov 02, 2021 at 10:22:36AM -0600, Alex Williamson wrote:
There's no point at which we can do SET_IRQS other than in the
_RESUMING state.  Generally SET_IRQS ioctls are coordinated with the
guest driver based on actions to the device, we can't be mucking
with IRQs while the device is presumed running and already
generating interrupt conditions.    
We need to do it in state 000

ie resume should go 

  000 -> 100 -> 000 -> 001

With SET_IRQS and any other fixing done during the 2nd 000, after the
migration data has been loaded into the device.  
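To make the bit patterns above easier to read, here is a minimal C sketch that decodes a 3-bit device_state value into flag names. It assumes the v1 VFIO migration uAPI bit layout (_RUNNING = bit 0, _SAVING = bit 1, _RESUMING = bit 2). The STATE_* macros, state_name() and proposed_resume[] are hypothetical names used only for illustration, not kernel or QEMU symbols.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed to mirror the v1 uAPI device_state bits; names are local to this sketch. */
#define STATE_RUNNING  (1u << 0)
#define STATE_SAVING   (1u << 1)
#define STATE_RESUMING (1u << 2)

/* The resume sequence proposed above: 000 -> 100 -> 000 -> 001. */
static const unsigned int proposed_resume[] = {
    0,              /* 000: stopped */
    STATE_RESUMING, /* 100: load the migration data */
    0,              /* 000: stopped again; SET_IRQS and other fixups would go here */
    STATE_RUNNING,  /* 001: running */
};

/* Hypothetical helper: write the flag names set in 's' into 'buf'. */
static const char *state_name(unsigned int s, char *buf, size_t len)
{
    static const struct { unsigned int bit; const char *name; } flags[] = {
        { STATE_RESUMING, "RESUMING" },
        { STATE_SAVING,   "SAVING"   },
        { STATE_RUNNING,  "RUNNING"  },
    };
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    if (s == 0) {
        snprintf(buf, len, "STOP");
        return buf;
    }
    for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
        int n;

        if (!(s & flags[i].bit))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated; stop appending */
        used += (size_t)n;
    }
    return buf;
}
```

Walking proposed_resume[] through state_name() yields STOP, RESUMING, STOP, RUNNING, which shows that the second 000 is an ordinary stopped state placed between loading the data and running.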
Again, this is not how QEMU works today.  
I know, I think it is a poor choice to carve out certain changes to
the device that must be preserved across loading the migration state.
The uAPI comment does not define when to do the SET_IRQS, it seems
this has been missed.

We really should fix it, unless you feel strongly that the
experimental API in qemu shouldn't be changed.  
I think the QEMU implementation fills in some details of how the uAPI
is expected to work.  
Well, we already know QEMU has problems, like the P2P thing. Is this a
bug, or a preferred limitation as designed?
MSI/X is expected to be restored while _RESUMING based on the
config space of the device, there is no intermediate step between
_RESUMING and _RUNNING.  Introducing such a requirement precludes
the option of a post-copy implementation of (_RESUMING | _RUNNING).  
Not precluded, a new state bit would be required to implement some
future post-copy.

0000 -> 1100 -> 1000 -> 1001 -> 0001

Instead of overloading the meaning of RUNNING.

I think this is cleaner anyhow.

(though I don't know how we'd structure the save side to get two
bitstreams)
The way this is supposed to work is that the device migration stream
contains the device internal state.  QEMU is then responsible for
restoring the external state of the device, including the DMA mappings,
interrupts, and config space.  It's not possible for the migration
driver to reestablish these things.  So there is a necessary division
of device state between QEMU and the migration driver.
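As a rough illustration of that division, the following toy C model tracks which half of the state has been restored and only lets the device leave _RESUMING for _RUNNING once both are in place. It is a sketch under assumptions, not kernel or QEMU code: struct dev_model and every model_* function are hypothetical, DMA mappings are left out for brevity, and the rule that interrupts follow config space reflects the "MSI/X ... based on the config space" expectation quoted earlier.

```c
#include <stdbool.h>

/* Assumed to mirror the v1 uAPI device_state bits; names are local to this sketch. */
#define STATE_RUNNING  (1u << 0)
#define STATE_RESUMING (1u << 2)

/* Hypothetical model of one device being resumed on the destination. */
struct dev_model {
    unsigned int state;
    bool stream_loaded;   /* internal state, carried by the migration stream */
    bool config_restored; /* external state, restored by userspace (QEMU) */
    bool irqs_restored;   /* external state, restored by userspace (QEMU) */
};

/* Internal state may only be loaded while the device is in _RESUMING. */
static int model_load_stream(struct dev_model *d)
{
    if (!(d->state & STATE_RESUMING))
        return -1;
    d->stream_loaded = true;
    return 0;
}

/* Userspace restores config space while the device is in _RESUMING. */
static int model_restore_config(struct dev_model *d)
{
    if (!(d->state & STATE_RESUMING))
        return -1;
    d->config_restored = true;
    return 0;
}

/* Interrupt setup depends on config space and is also done in _RESUMING. */
static int model_set_irqs(struct dev_model *d)
{
    if (!(d->state & STATE_RESUMING) || !d->config_restored)
        return -1;
    d->irqs_restored = true;
    return 0;
}

/* Refuse _RESUMING -> _RUNNING until both halves of the state are restored. */
static int model_set_state(struct dev_model *d, unsigned int next)
{
    if ((d->state & STATE_RESUMING) && (next & STATE_RUNNING) &&
        !(d->stream_loaded && d->config_restored && d->irqs_restored))
        return -1;
    d->state = next;
    return 0;
}
```

In this model there is no intermediate stopped step: everything is restored while the device is in _RESUMING, and the next transition goes straight to _RUNNING.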

If we don't think the uAPI includes the necessary states, doesn't
sufficiently define the states, and we're not following the existing
QEMU implementation as the guide for the intentions of the uAPI spec,
then what exactly is the proposed mlx5 migration driver implementing
and why would we even consider including it at this point?  Thanks,

Alex