RE: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices

From: "Tian, Kevin" <kevin.tian@intel.com>
Date: 2021-11-08 08:53:28
Also in: kvm, linux-pci

From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Tuesday, October 26, 2021 11:19 PM

On Tue, Oct 26, 2021 at 08:42:12AM -0600, Alex Williamson wrote:

quoted

This is also why I don't like it being so transparent as it is
something userspace needs to care about - especially if the HW cannot
support such a thing, if we intend to allow that.

Userspace does need to care, but userspace's concern over this should
not be able to compromise the platform and therefore making VF
assignment more susceptible to fatal error conditions to comply with a
migration uAPI is troublesome for me.

It is an interesting scenario.

I think it points out that we are not implementing this fully properly.

The !RUNNING state should be like your reset efforts.

All access to the MMIO memories from userspace should be revoked
during !RUNNING

This assumes that vCPUs must be stopped before !RUNNING is entered
in the virtualization case, and that is true today.

But it may not hold when talking about guest SVA and I/O page faults [1].
The problem is that pending requests may trigger I/O page faults
on guest page tables. Without running vCPUs to handle those faults, the
quiesce command cannot finish draining the pending requests
if the device doesn't support preempt-on-fault (that is at least the case for
some Intel and Huawei devices, and possibly for most initial SVA
implementations).
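
To make the dependency concrete, here is a toy model in C. It is only a
sketch: struct model_dev, struct pending_req and model_quiesce() are
hypothetical names invented for illustration, not kernel or device APIs,
and real hardware behaviour is device specific. It shows a drain that gets
stuck only when a request faults, the vCPUs are stopped, and the device
cannot preempt on fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not a real kernel or device interface. */
struct pending_req {
	bool needs_page_fault;	/* request touches a guest page that is not mapped yet */
};

struct model_dev {
	bool preempt_on_fault;	/* device can set a faulting request aside */
	bool vcpus_running;	/* guest is able to service I/O page faults */
};

enum drain_result { DRAIN_DONE, DRAIN_STUCK };

/*
 * Try to drain every pending request as part of quiescing the device.
 * A faulting request can only be retired if the guest resolves the
 * fault (vCPUs running) or the device can preempt the request.
 */
static enum drain_result model_quiesce(const struct model_dev *dev,
				       const struct pending_req *reqs,
				       size_t n)
{
	assert(dev && (reqs || n == 0));

	for (size_t i = 0; i < n; i++) {
		if (!reqs[i].needs_page_fault)
			continue;	/* completes without guest help */
		if (dev->vcpus_running)
			continue;	/* guest resolves the fault */
		if (dev->preempt_on_fault)
			continue;	/* device parks the request */
		return DRAIN_STUCK;	/* nobody can resolve the fault */
	}
	return DRAIN_DONE;
}
```

In this model the only stuck combination is the one described above:
stopped vCPUs plus a device without preempt-on-fault.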

Of course, migrating guest SVA requires more changes, as discussed in [1].
Here I just want to point out this forward-looking requirement so that any
definition change in this thread won't break that usage.

[1] https://lore.kernel.org/qemu-devel/06cb5bfd-f6f8-b61b-1a7e-60a9ae2f8fac@nvidia.com/T/
(p.s. 'stop device' in [1] means 'quiesce device' in this thread)

Thanks,
Kevin

All VMAs zap'd.

All IOMMU peer mappings invalidated.

The kernel should directly block userspace from causing an MMIO TLP
before the device driver goes to !RUNNING.

Then the question of what the device does at this edge is not
relevant as hostile userspace cannot trigger it.

The logical way to implement this is to key off running and
block/unblock MMIO access when !RUNNING.

To me this strongly suggests that the extra bit is the correct way
forward as the driver is much simpler to implement and understand if
RUNNING directly controls the availability of MMIO instead of having
an irregular case where !RUNNING still allows MMIO but only until a
pending_bytes read.
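
A minimal sketch of that idea in C, under stated assumptions: every name
here (struct model_vdev, MODEL_STATE_RUNNING, model_set_state(),
model_mmio_write()) is hypothetical and is not the vfio or mlx5 code. The
mmio_blocked flag stands in for zapping VMAs and invalidating IOMMU peer
mappings, and a blocked access is modelled as an error return where a real
mapping would fault instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_STATE_RUNNING	(1u << 0)	/* hypothetical RUNNING bit */
#define MODEL_NUM_REGS		4u

struct model_vdev {
	uint32_t device_state;
	bool mmio_blocked;	/* stands in for zapped VMAs / invalidated peer mappings */
	uint32_t regs[MODEL_NUM_REGS];
};

/* Key MMIO availability directly off the RUNNING bit. */
static void model_set_state(struct model_vdev *v, uint32_t new_state)
{
	bool was_running = v->device_state & MODEL_STATE_RUNNING;
	bool will_run = new_state & MODEL_STATE_RUNNING;

	if (was_running && !will_run)
		v->mmio_blocked = true;		/* revoke access before leaving RUNNING */

	v->device_state = new_state;

	if (!was_running && will_run)
		v->mmio_blocked = false;	/* restore access once RUNNING again */
}

/* Userspace-originated register write in the model. */
static int model_mmio_write(struct model_vdev *v, unsigned int idx, uint32_t val)
{
	assert(v);

	if (idx >= MODEL_NUM_REGS)
		return -EINVAL;
	if (v->mmio_blocked)
		return -EIO;	/* no write reaches the device while !RUNNING */

	v->regs[idx] = val;
	return 0;
}
```

With this shape there is no window where !RUNNING still accepts MMIO, so
what the device would do with such an access never has to be defined.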

Given the complexity of this, can we move ahead with the current
mlx5_vfio, and Yishai & co. can come up with a follow-up proposal to split
the freeze/quiesce and block MMIO?

Jason
