Thread: 29 messages, 4 authors, 2021-12-08

Re: [PATCH RFC v2] vfio: Documentation for the migration region

From: Cornelia Huck <cohuck@redhat.com>
Date: 2021-12-06 18:07:03
Also in: kvm

On Mon, Dec 06 2021, Jason Gunthorpe [off-list ref] wrote:
> On Mon, Dec 06, 2021 at 05:03:00PM +0100, Cornelia Huck wrote:
> > > If we're writing a specification, that's really a MAY statement,
> > > userspace MAY issue a reset to abort the RESUMING process and return
> > > the device to RUNNING.  They MAY also write the device_state directly,
> > > which MAY return an error depending on various factors such as whether
> > > data has been written to the migration state and whether that data is
> > > complete.  If a failed transition results in an ERROR device_state,
> > > the user MUST issue a reset in order to return it to a RUNNING state
> > > without closing the interface.
> >
> > Are we actually writing a specification? If yes, we need to be more
> > clear on what is mandatory (MUST), advised (SHOULD), or allowed
> > (MAY). If I look at the current proposal, I'm not sure into which
> > category some of the statements fall.
>
> I deliberately didn't use such formal language because this is far
> from what I'd consider an acceptable spec. It is more words about how
> things work and some kind of basis for agreement between user and
> kernel.

We don't really need formal language, but there are too many unclear
statements, as the discussion above showed. Therefore my question: What
are we actually writing? Even if it is not a formal specification, it
still needs to be clear.

> Under Linus's "don't break userspace" guideline whatever userspace
> ends up doing becomes the spec the kernel is wedded to, regardless of
> what we write down here.

All the more important that we actually agree before this is merged! I
don't want choices hidden deep inside the mlx5 driver dictating what
other drivers should do, it must be reasonably easy to figure out
(including what is mandatory, and what is flexible.)

> Which basically means whatever mlx5 and qemu do after we go forward
> is the definitive spec and we cannot change qemu in a way that is
> incompatible with mlx5 or introduce a new driver that is incompatible
> with qemu.

TBH, I'm not too happy with the current QEMU state, either. We need to
take a long, hard look first and figure out what we need to do to make
the QEMU support non-experimental.

We're discussing a complex topic here, and we really don't want to
perpetuate an unclear uAPI. This is where my push for more precise
statements is coming from.
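
To make the kind of precision being asked for concrete, the quoted MAY/MUST paragraph about RESUMING could be written down as a small state model. This is only a sketch and not the VFIO uAPI: the class `ToyMigrationDevice`, its methods, and the `data_written`/`data_complete` flags are hypothetical names, and the rule that incomplete data sends the device to ERROR is an assumption (the quoted text only says the write MAY fail and that a failure may end in ERROR).

```python
from enum import Enum


class DeviceState(Enum):
    RUNNING = "running"
    RESUMING = "resuming"
    ERROR = "error"


class ToyMigrationDevice:
    """Toy model of the quoted MAY/MUST rules; not the real VFIO interface."""

    def __init__(self):
        self.state = DeviceState.RUNNING
        self.data_written = False   # has any migration data been loaded?
        self.data_complete = False  # assumption: the device knows when data is complete

    def reset(self):
        # Userspace MAY reset to abort RESUMING, and MUST reset to leave
        # ERROR without closing the interface.
        self.state = DeviceState.RUNNING
        self.data_written = False
        self.data_complete = False

    def write_device_state(self, new_state):
        """Return True on success, False if the transition failed."""
        if new_state is DeviceState.ERROR:
            # ERROR is a result of a failure, never a requested state.
            return False
        if self.state is DeviceState.ERROR:
            # A direct write cannot leave ERROR; only reset() can.
            return False
        if (self.state is DeviceState.RESUMING
                and new_state is DeviceState.RUNNING
                and self.data_written
                and not self.data_complete):
            # Assumption: leaving RESUMING with partial data fails into ERROR.
            self.state = DeviceState.ERROR
            return False
        self.state = new_state
        return True
```

Even in this toy form, each branch forces a decision (does the write fail, and does the failure land in ERROR?) that the prose leaves open, which is the gap a more precise document would need to close.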