Thread: 100 messages, 8 authors, 2021-11-22

Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices

From: Alex Williamson <hidden>
Date: 2021-10-21 21:47:38
Also in: kvm, linux-pci

On Thu, 21 Oct 2021 11:34:00 +0200
Cornelia Huck [off-list ref] wrote:
On Wed, Oct 20 2021, Alex Williamson [off-list ref] wrote:
quoted
On Wed, 20 Oct 2021 15:59:19 -0300
Jason Gunthorpe [off-list ref] wrote:
 
quoted
On Wed, Oct 20, 2021 at 10:52:30AM -0600, Alex Williamson wrote:
  
quoted
I'm wondering if we're imposing extra requirements on the !_RUNNING
state that don't need to be there.  For example, if we can assume that
all devices within a userspace context are !_RUNNING before any of the
devices begin to retrieve final state, then clearing of the _RUNNING
bit becomes the device quiesce point and the beginning of reading
device data is the point at which the device state is frozen and
serialized.  No new states required and essentially works with a slight
rearrangement of the callbacks in this series.  Why can't we do that?

It sounds worth checking carefully. I didn't come up with a major
counter scenario.

We would need to specifically define which user action triggers the
device to freeze and serialize. Reading pending_bytes I suppose?

The first read of pending_bytes after clearing the _RUNNING bit would
be the logical place to do this since that's what we define as the start
of the cycle for reading the device state.
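
A minimal sketch of that trigger, modeling only the stop-and-copy path and
not the actual callbacks in this series: clearing _RUNNING only quiesces the
device, and the first pending_bytes read afterwards freezes and serializes
it. Every name here (mock_dev, mock_phase, mock_clear_running,
mock_read_pending_bytes, state_size) is hypothetical and does not come from
the kernel or the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one device; not kernel code. */
enum mock_phase {
    MOCK_RUNNING,   /* _RUNNING set: device operates normally */
    MOCK_QUIESCED,  /* _RUNNING cleared: device stops initiating DMA */
    MOCK_FROZEN,    /* state has been serialized for the user to read */
};

struct mock_dev {
    enum mock_phase phase;
    uint64_t state_size;     /* assumed size of the final device state */
    uint64_t pending_bytes;  /* serialized bytes still to be read */
};

/* User clears the _RUNNING bit: quiesce only, serialize nothing yet. */
void mock_clear_running(struct mock_dev *d)
{
    assert(d != 0);
    if (d->phase == MOCK_RUNNING)
        d->phase = MOCK_QUIESCED;
}

/*
 * User reads pending_bytes. The first read after _RUNNING was cleared
 * is the point where the state is frozen and serialized; later reads
 * only report how much is left.
 */
uint64_t mock_read_pending_bytes(struct mock_dev *d)
{
    assert(d != 0);
    if (d->phase == MOCK_QUIESCED) {
        d->phase = MOCK_FROZEN;
        d->pending_bytes = d->state_size;
    }
    return d->pending_bytes;
}
```

In this model a device that has been quiesced but not yet read still reports
no serialized state, which is what lets userspace stop every device before
any of them is frozen.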

"Freezing" the device is a valid implementation, but I don't think it's
strictly required per the uAPI.  For instance there's no requirement
that pending_bytes is reduced by data_size on each iteration; we
specifically only define that the state is complete when the user reads
a pending_bytes value of zero.  So a driver could restart the device
state if the device continues to change (though it's debatable whether
triggering an -errno on the next migration region access might be a
more supportable approach to enforce that userspace has quiesced
external access).

Hm, not so sure. From my reading of the uAPI, transitioning from
pre-copy to stop-and-copy (i.e. clearing _RUNNING) implies that we
freeze the device (at least, that's how I interpret "On state transition
from pre-copy to stop-and-copy, the driver must stop the device, save
the device state and send it to the user application through the
migration region.")

"[S]end it to the user application through the migration region" is
certainly not something that's encompassed just by clearing the _RUNNING
bit.  There's a sequence of operations there.  If the device is
quiesced for outbound DMA and frozen from inbound DMA (or can
reasonably expect no further inbound DMA) before the user reads the
data, I think that meets the description.
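
To make the ordering concrete, here is a rough userspace-side sketch under
the assumption discussed above: every device in the context has _RUNNING
cleared before any device's state is read, and each device is read until
pending_bytes returns zero, without assuming it shrinks by data_size each
time. The sim_* helpers and save_all are hypothetical stand-ins for accesses
to the migration region, not real VFIO calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one device's migration region. */
struct sim_dev {
    bool running;         /* models the _RUNNING bit */
    bool frozen;          /* set by the first pending_bytes read */
    uint64_t state_size;  /* assumed size of the serialized state */
    uint64_t pending;     /* bytes still to be read */
    uint64_t chunk;       /* largest data_size offered per iteration */
};

/* Clearing _RUNNING quiesces the device; nothing is serialized yet. */
void sim_clear_running(struct sim_dev *d)
{
    d->running = false;
}

/* First read after quiesce freezes and serializes the state. */
uint64_t sim_read_pending(struct sim_dev *d)
{
    assert(!d->running);  /* reading final state requires !_RUNNING */
    if (!d->frozen) {
        d->frozen = true;
        d->pending = d->state_size;
    }
    return d->pending;
}

/* Consume one chunk of state; returns the data_size for this iteration. */
uint64_t sim_read_data(struct sim_dev *d)
{
    assert(d->chunk > 0);
    uint64_t n = d->pending < d->chunk ? d->pending : d->chunk;
    d->pending -= n;
    return n;
}

/* Save every device in the context; returns the total bytes read. */
uint64_t save_all(struct sim_dev *devs, size_t n)
{
    uint64_t total = 0;

    /* Step 1: quiesce all devices so none can DMA into a frozen peer. */
    for (size_t i = 0; i < n; i++)
        sim_clear_running(&devs[i]);

    /* Step 2: only now read state; completion is pending_bytes == 0. */
    for (size_t i = 0; i < n; i++)
        while (sim_read_pending(&devs[i]) != 0)
            total += sim_read_data(&devs[i]);

    return total;
}
```

The two separate loops are the point of the sketch: if quiesce and read were
interleaved per device, a still-running peer could change a device that had
already been serialized.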

We can certainly clarify the spec in the process if we agree that we
can do this without adding another state bit.

I recall that we previously suggested a very strict interpretation of
clearing the _RUNNING bit, but again I'm questioning if that's a real
requirement or simply a nice-to-have feature for some undefined
debugging capability.  In raising the p2p DMA issue, we can see that a
hard stop independent of other devices is not really practical but I
also don't see that introducing a new state bit solves this problem any
more elegantly than proposed here.  Thanks,

Alex