Re: [PATCH RFC] vfio: Documentation for the migration region

From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2021-11-25 16:16:51
Also in: kvm

On Thu, Nov 25, 2021 at 01:27:12PM +0100, Cornelia Huck wrote:

On Wed, Nov 24 2021, Jason Gunthorpe [off-list ref] wrote:

On Wed, Nov 24, 2021 at 05:55:49PM +0100, Cornelia Huck wrote:

What I meant to say: If we give userspace the flexibility to operate
this, we also must give different device types some flexibility. While
subchannels will follow the general flow, they'll probably condense/omit
some steps, as I/O is quite different to PCI there.

I would say no - migration is general, no device type should get to
violate this spec.  Did you have something specific in mind? There is
very little PCI-specific here already.

I'm not really thinking about violating the spec, but more omitting
things that do not really apply to the hardware. For example, it is
really easy to shut up a subchannel, we don't really need to wait until
nothing happens anymore, and it doesn't even have MMIO.

I've never really looked closely at the s390 mdev drivers..

What does something like AP even do anyhow? The ioctl handler doesn't
do anything, there is no mmap hook, how does the VFIO userspace
interact with this thing?

In general, userspace can issue a VFIO_DEVICE_RESET ioctl and recover the
device back to device_state RUNNING. When a migration driver executes this
ioctl it should discard the data window and set migration_state to RUNNING as
part of resetting the device to a clean state. This must happen even if the
migration_state has errored. A freshly opened device FD should always be in
the RUNNING state.

Can the state immediately change from RUNNING to ERROR again?

Immediately? State change can only happen in response to the ioctl or
the reset.

"The migration_state cannot change asynchronously, upon writing the
migration_state the driver will either keep the current state and return
failure, return failure and go to ERROR, or succeed and go to the new state."
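
To make the three outcomes and the reset rule concrete, here is a rough
sketch in C. It is only a model of the wording above, not the proposed
uAPI: every name in it (mig_state, mig_outcome, mig_hw_result,
mig_write_state, mig_reset) is hypothetical, and the state list is cut
down to what this thread mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced state set; the real uAPI values are not given here. */
enum mig_state { MIG_RUNNING, MIG_STOPPED, MIG_ERROR };

/* The three outcomes the quoted text allows for a migration_state write. */
enum mig_outcome {
	MIG_OK,              /* success, device is in the new state */
	MIG_FAIL_STATE_KEPT, /* failure, device stays in its current state */
	MIG_FAIL_NOW_ERROR   /* failure, device moved to ERROR */
};

struct mig_dev {
	enum mig_state state;
};

/* Hypothetical stand-in for what the driver learned while trying the change. */
struct mig_hw_result {
	bool ok;            /* the transition completed */
	bool device_intact; /* on failure: the old state is still valid */
};

/* State only changes here or in mig_reset(), never asynchronously. */
enum mig_outcome mig_write_state(struct mig_dev *dev, enum mig_state new_state,
				 struct mig_hw_result hw)
{
	assert(dev != NULL);

	if (hw.ok) {
		dev->state = new_state;
		return MIG_OK;
	}
	if (hw.device_intact)
		return MIG_FAIL_STATE_KEPT;

	dev->state = MIG_ERROR;
	return MIG_FAIL_NOW_ERROR;
}

/* Models reset: always ends in RUNNING, even if the state had errored. */
void mig_reset(struct mig_dev *dev)
{
	assert(dev != NULL);
	dev->state = MIG_RUNNING;
}
```

In this model a device that has just been reset can only reach ERROR
again through another failed mig_write_state() call, which is the point
of the answer above.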

However, a device may not compromise system integrity if it is subjected to a
MMIO. It can not trigger an error TLP, it can not trigger a Machine Check, and
it can not compromise device isolation.

"Machine Check" may be confusing to readers coming from s390; there, the
device does not trigger the machine check, but the channel subsystem
does, and we cannot prevent it. Maybe we can word it more as an example,
so readers get an idea what the limits in this state are?

Let's say x86 machine check then, which is a kernel-fatal event.

Although I would like to see some more feedback from others, I think
this is already a huge step in the right direction.

Thanks, I made all your other changes

Will send a v2 next week

Jason
