Thread: 100 messages, 8 authors, 2021-11-22

RE: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices

From: Shameerali Kolothum Thodi <hidden>
Date: 2021-11-02 11:19:30
Also in: kvm, linux-pci

> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@nvidia.com]
> Sent: 01 November 2021 17:25
> To: Alex Williamson <redacted>; Shameerali Kolothum
> Thodi [off-list ref]
> Cc: Cornelia Huck <cohuck@redhat.com>; Yishai Hadas <yishaih@nvidia.com>;
> bhelgaas@google.com; saeedm@nvidia.com; linux-pci@vger.kernel.org;
> kvm@vger.kernel.org; netdev@vger.kernel.org; kuba@kernel.org;
> leonro@nvidia.com; kwankhede@nvidia.com; mgurtovoy@nvidia.com;
> maorg@nvidia.com; Dr. David Alan Gilbert [off-list ref]
> Subject: Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver
> for mlx5 devices
>
> On Fri, Oct 29, 2021 at 04:06:21PM -0600, Alex Williamson wrote:
> > > Right now we are focused on the non-P2P cases, which I think is a
> > > reasonable starting limitation.
> >
> > It's a reasonable starting point iff we know that we need to support
> > devices that cannot themselves support a quiescent state.  Otherwise it
> > would make sense to go back to work on the uAPI because I suspect the
> > implications to userspace are not going to be as simple as "oops, can't
> > migrate, there are two devices."  As you say, there's a universe of
> > devices that run together that don't care about p2p and QEMU will be
> > pressured to support migration of those configurations.
>
> I agree with this, but I also think what I saw in the proposed hns
> driver suggests its HW cannot do quiescent; if so, this is the first
> counter-example to the notion it is a universal ability?
>
> hns people: Can you put your device in a state where it is operating,
> able to accept and respond to MMIO, and yet guarantees it generates no
> DMA transactions?

AFAIK, I am afraid we cannot guarantee that with our current implementation.
At present, in the !RUNNING state we put the device into a PAUSE state, so it
completes the current request and keeps the remaining ones in the queue. But it
can still receive a new request, which will trigger an exit from the PAUSE state
and resume operation.

So I guess it is possible to corrupt the migration if user space misbehaves.
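
To make the failure mode concrete, here is a rough toy model of the pause behaviour described above. It is only a sketch under stated assumptions, not the hns driver code: the class `ToyDevice`, its methods, and the request names are all hypothetical, and a list append stands in for a DMA transaction.

```python
from collections import deque


class ToyDevice:
    """Hypothetical model of the PAUSE behaviour described in this mail."""

    def __init__(self, pending):
        self.paused = False
        self.queue = deque(pending)  # requests already submitted by the guest
        self.dma_log = []            # each entry stands in for one DMA transaction

    def _drain(self):
        # While not paused, the device works through its queue and issues DMA.
        while self.queue and not self.paused:
            self.dma_log.append(self.queue.popleft())

    def pause(self):
        # Entering !RUNNING: finish the current request, hold the rest in the queue.
        if self.queue:
            self.dma_log.append(self.queue.popleft())
        self.paused = True

    def submit(self, req):
        # A new request (an MMIO write in the real device) exits PAUSE
        # and resumes operation, as described above.
        self.queue.append(req)
        self.paused = False
        self._drain()


if __name__ == "__main__":
    dev = ToyDevice(["a", "b", "c"])
    dev.pause()
    saved = len(dev.dma_log)          # point at which state would be saved
    dev.submit("d")                   # misbehaving user space
    print("DMA issued after pause:", dev.dma_log[saved:])
```

In this model, every entry logged after `pause()` returns is a DMA that the migration state would not account for, which is the corruption risk mentioned above.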

Thanks,
Shameer