Re: [PATCH mlx5-next 2/7] vfio: Add an API to check migration state transition validity
From: Jason Gunthorpe <jgg@ziepe.ca>
Date: 2021-09-30 17:01:50
Also in: kvm, linux-pci, linux-rdma, lkml
On Thu, Sep 30, 2021 at 07:51:22PM +0300, Max Gurtovoy wrote:
> On 9/30/2021 7:24 PM, Jason Gunthorpe wrote:
> > On Thu, Sep 30, 2021 at 06:32:07PM +0300, Max Gurtovoy wrote:
> > > > Just prior to open device the vfio pci layer will generate a FLR to
> > > > the function so we expect that post open_device has a fresh from
> > > > reset fully running device state.
> > > Does running also mean that the device doesn't have a clue about its
> > > internal state? Or does running mean unfrozen and unquiesced?
> > The device just got FLR'd and it should be in a clean state and
> > operating. Think the VM is booting for the first time.
> During the resume phase in the dst, the VM is paused and not booting.
> Migration SW is waiting to get memory and state from SRC. The device
> will start from the exact point it was at in the src. It's exactly
> "000b => Device Stopped, not saving or resuming"
For this case QEMU should open the VFIO device and immediately issue a command to go to resuming. The kernel cannot know at open_device time which case userspace is trying to do. Due to backwards compat we assume userspace is going to boot a fresh VM.
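A minimal sketch of that ordering, as a userspace-only model rather than real VFIO code. The struct, the helper names and the MIG_* macros are hypothetical; the macros only mirror the bit layout used in this thread (001b running, 010b saving, 100b resuming):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names mirroring the bit layout discussed in this thread. */
#define MIG_RUNNING  (1u << 0)	/* 001b */
#define MIG_SAVING   (1u << 1)	/* 010b */
#define MIG_RESUMING (1u << 2)	/* 100b */
#define MIG_MASK     (MIG_RUNNING | MIG_SAVING | MIG_RESUMING)

/* Toy stand-in for a device; not a kernel or QEMU structure. */
struct fake_vfio_dev {
	uint32_t state;
};

/* Models open_device: the function was just FLR'd, so it is fresh and running. */
static void fake_open_device(struct fake_vfio_dev *d)
{
	d->state = MIG_RUNNING;
}

/* Models userspace requesting a new migration state. */
static void fake_set_state(struct fake_vfio_dev *d, uint32_t new_state)
{
	assert((new_state & ~MIG_MASK) == 0);
	d->state = new_state;
}

/* Destination side: open, then immediately ask for resuming. */
static void dst_prepare_incoming(struct fake_vfio_dev *d)
{
	fake_open_device(d);		/* kernel assumes a fresh VM boot */
	fake_set_state(d, MIG_RESUMING);	/* userspace says otherwise */
}
```

The point of the model is only that the choice between "fresh boot" and "incoming migration" is made by userspace after open, not by the kernel at open time.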
> Well, this is your design for the driver implementation. Nobody is
> preventing other drivers from starting to deserialize device state into
> the device while the RESUMING bit is on.
It is a logical model. Devices can stream the migration data directly into the internal state if they like. It just creates more conditions where they have to report an error state.
> So if we moved from 100b to 010b somehow, one should deserialize its
> buffer to the device, and then serialize it to the migration region again?
Yes.
> I guess it's doable since the device is frozen and quiesced. But moving
> from 100b to 011b is not possible, right?
Why not? 100b to 011b is no different than going indirectly 100b -> 001b -> 011b. The time spent in 001b is just negligible.

Jason
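To illustrate the logical model, here is a rough sketch that expands a direct transition out of resuming into the indirect steps. The function and the MIG_* macros are hypothetical and are not the API proposed in this patch. Only 100b -> 001b -> 011b comes from the discussion above; routing 100b -> 010b through 000b is an assumption made for symmetry:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names mirroring the bit layout discussed in this thread. */
#define MIG_RUNNING  (1u << 0)	/* 001b */
#define MIG_SAVING   (1u << 1)	/* 010b */
#define MIG_RESUMING (1u << 2)	/* 100b */
#define MIG_MASK     (MIG_RUNNING | MIG_SAVING | MIG_RESUMING)

/*
 * Expand a requested transition into the logical steps a driver would
 * walk through. Leaving resuming for a state that has the saving bit set
 * first lands in that state with saving cleared (resume completes, the
 * buffer is loaded into the device), and only then starts saving.
 * Returns the number of entries written to steps[].
 */
static size_t mig_expand_transition(uint32_t from, uint32_t to,
				    uint32_t steps[2])
{
	assert((from & ~MIG_MASK) == 0 && (to & ~MIG_MASK) == 0);

	if (from == MIG_RESUMING && (to & MIG_SAVING)) {
		steps[0] = to & ~MIG_SAVING;	/* e.g. 001b for a 011b target */
		steps[1] = to;
		return 2;
	}

	/* Everything else is treated as a single step in this sketch. */
	steps[0] = to;
	return 1;
}
```

For 100b -> 011b this yields 001b followed by 011b, which is the same sequence as the indirect path, just without userspace observing the intermediate state.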