Re: [PATCH mlx5-next 2/7] vfio: Add an API to check migration state transition validity
From: Jason Gunthorpe <jgg@ziepe.ca>
Date: 2021-09-30 14:47:58
Also in:
kvm, linux-pci, linux-rdma, lkml
On Thu, Sep 30, 2021 at 12:34:19PM +0300, Max Gurtovoy wrote:
> > When we add the migration extension this cannot change, so after open_device() the device should be operational.
>
> if it's waiting for incoming migration blob, it is not running.
It cannot be waiting for a migration blob after open_device(); that is not backwards compatible. Just prior to open_device() the vfio pci layer will generate an FLR to the function, so we expect that post open_device() we have a fresh-from-reset, fully running device state.
> > The reported state in the migration region should accurately reflect what the device is currently doing. If the device is operational then it must report running, not stopped.
>
> STOP in migration meaning.
As Alex and I have said several times, STOP means the internal state is not allowed to change.
> > driver will see RESUMING toggle off so it will trigger a de-serialization
>
> You mean stop serialization ?
No, I mean it will take all the migration data that has been uploaded through the migration region and de-serialize it into active device state.
> > driver will see SAVING toggled on so it will serialize the new state (either the pre-copy state or the post-copy state depending on the running bit)
>
> lets leave the bits and how you implement the state numbering aside.
You've missed the point. This isn't an FSM. It is a series of three control bits, and we have assigned logical meaning to their combinations. The algorithm I gave is a control-centric algorithm, not a state-centric algorithm, and it matches the direction Alex thought this was being designed for.
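To make the control-centric reading concrete, here is a rough sketch of a driver reacting to which bits toggled rather than looking up a (old state, new state) pair in a transition table. The bit values, the ACT_* names and device_state_actions() are all hypothetical and exist only for illustration. Only the "RESUMING toggled off" and "SAVING toggled on" reactions come from this thread; the handling of the RUNNING bit is an assumption.

```c
#include <stdint.h>

/* Hypothetical values for the three control bits discussed in this thread. */
#define DEV_STATE_RUNNING  (1u << 0)
#define DEV_STATE_SAVING   (1u << 1)
#define DEV_STATE_RESUMING (1u << 2)

/* Hypothetical actions a driver could take; names are illustrative only. */
#define ACT_LOAD_STATE         (1u << 0) /* de-serialize uploaded migration data */
#define ACT_QUIESCE            (1u << 1) /* internal state may no longer change */
#define ACT_SAVE_RUNNING_STATE (1u << 2) /* serialize while still running */
#define ACT_SAVE_STOPPED_STATE (1u << 3) /* serialize the stopped device */
#define ACT_START              (1u << 4) /* let the device run again */

/*
 * Look at each control bit on its own and react to its toggle.
 * There is no table of valid state transitions anywhere in here.
 */
static unsigned int device_state_actions(uint32_t old_bits, uint32_t new_bits)
{
	uint32_t changed = old_bits ^ new_bits;
	unsigned int actions = 0;

	/* RESUMING toggled off: turn the uploaded data into device state. */
	if ((changed & DEV_STATE_RESUMING) && !(new_bits & DEV_STATE_RESUMING))
		actions |= ACT_LOAD_STATE;

	/* RUNNING toggled off (assumed handling): freeze internal state. */
	if ((changed & DEV_STATE_RUNNING) && !(new_bits & DEV_STATE_RUNNING))
		actions |= ACT_QUIESCE;

	/* SAVING toggled on: which snapshot depends on the RUNNING bit. */
	if ((changed & DEV_STATE_SAVING) && (new_bits & DEV_STATE_SAVING))
		actions |= (new_bits & DEV_STATE_RUNNING) ?
			   ACT_SAVE_RUNNING_STATE : ACT_SAVE_STOPPED_STATE;

	/* RUNNING toggled on (assumed handling): resume operation. */
	if ((changed & DEV_STATE_RUNNING) && (new_bits & DEV_STATE_RUNNING))
		actions |= ACT_START;

	return actions;
}
```

With this shape, a write that clears RESUMING and sets RUNNING in one go yields ACT_LOAD_STATE | ACT_START, with no extra intermediate state needed to describe it.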
> If you finish resuming you can move to a new state (that we should add) => RESUMED.
It is not a state machine. Once you stop pretending this is implementing an FSM, Alex's position makes perfect sense.

Jason