Thread (4 messages) 4 messages, 2 authors, 2d ago

Re: [PATCH v4 1/3] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature

From: Zhiping Zhang <hidden>
Date: 2026-05-23 01:04:13
Also in: dri-devel, kvm, linux-pci, linux-rdma

On Thu, May 21, 2026 at 3:24 PM Alex Williamson [off-list ref] wrote:
quoted
On Thu, 21 May 2026 16:04:12 -0600
Alex Williamson [off-list ref] wrote:
quoted
On Tue, 19 May 2026 13:13:49 -0700
Zhiping Zhang [off-list ref] wrote:
quoted
Add a dma-buf get_tph callback for exporters to return TPH
(TLP Processing Hints) metadata, and add VFIO_DEVICE_FEATURE_DMA_BUF_TPH
so userspace can attach that metadata to a VFIO-exported dma-buf.
This should be two patches, the first extending the dma-buf framework
for the get_tph callback for explicit approval from dma-buf maintainers
(who are not even copied here).  The second the vfio-pci implementation
of get_tph.
quoted
8-bit ST and 16-bit Extended ST are distinct PCIe TPH namespaces; the
uAPI carries both with explicit validity flags so importers get the
value matching their requested width. SET is write-once per dma-buf;
the existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI is unchanged.
I didn't see what motivated this write-once change, I thought we
understood that it was a userspace problem that the tph values need to
be set before providing the dma-buf fd to the importer and that races
relative to that are a userspace ordering problem.  Write-once seems
unnecessarily restrictive and there's no justification provided here.
quoted
Signed-off-by: Zhiping Zhang <redacted>
---
 drivers/vfio/pci/vfio_pci_core.c   |   3 +
 drivers/vfio/pci/vfio_pci_dmabuf.c | 134 +++++++++++++++++++++++++++--
 drivers/vfio/pci/vfio_pci_priv.h   |  12 +++
 include/linux/dma-buf.h            |  21 +++++
 include/uapi/linux/vfio.h          |  35 ++++++++
 5 files changed, 198 insertions(+), 7 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 3f8d093aacf8..94aa6dd95701 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
            return vfio_pci_core_feature_token(vdev, flags, arg, argsz);
    case VFIO_DEVICE_FEATURE_DMA_BUF:
            return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz);
+   case VFIO_DEVICE_FEATURE_DMA_BUF_TPH:
+           return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg,
+                                                    argsz);
    default:
            return -ENOTTY;
    }
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index f87fd32e4a01..be1c65385670 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -19,7 +19,24 @@ struct vfio_pci_dma_buf {
    u32 nr_ranges;
    struct kref kref;
    struct completion comp;
-   u8 revoked : 1;
+   /*
+    * TPH metadata published by VFIO_DEVICE_FEATURE_DMA_BUF_TPH and
+    * consumed by the @get_tph dma-buf callback.
+    *
+    * @tph_flags is the publish/consume gate: writers populate
+    * @steering_tag, @steering_tag_ext and @ph first, then store
+    * @tph_flags with smp_store_release(); readers do
+    * smp_load_acquire(&tph_flags) before accessing the value fields.
+    * @tph_flags == 0 means "TPH not set". Writers publish a non-zero
+    * value only once per dma-buf and serialize via vdev->memory_lock;
+    * readers stay lockless to avoid AB-BA against the dma_resv_lock held
+    * by importers.
+    */
Can you outline the ABBA hazard, I'm not seeing it.  You're acquiring
memory_lock in the feature SET and dma_resv_lock doesn't appear to be
held when calling .get_tph().  There's a lot of lockless complication
here balanced on this claim of avoiding a hazard that doesn't appear
present.
quoted
+   u32 tph_flags;
+   u16 steering_tag_ext;
+   u8 steering_tag;
+   u8 ph;
+   bool revoked;
If we still used memory_lock for tph, these could be:

      u8 tph_st_valid:1; /* memory_lock */
      u8 tph_st_ext_valid:1; /* memory_lock */
      u8 tph_ph:2; /* memory_lock */
      u8 tph_st;
      u16 tph_st_ext;
      u8 revoked:1; /* dma_resv_lock */

The existing change of @revoked from bitfield to bool has no rationale
noted for it in the commit log.
On second thought, what dependency does anything here have on
memory_lock?  I think we're jumping through hoops to avoid a lock we
don't even need.  If we just want to serialize SET vs get_tph we could
have a mutex on the dma-buf structure, or use RCU if we want to manage
it locklessly and make sure get_tph always sees a fully consistent set
of values.  Thanks,

Alex
Agreed, we don't need memory_lock in this path. For v5 I'll instead add a
struct mutex lock to struct vfio_pci_dma_buf and take it in SET,
get_tph,
and around the priv->vdev = NULL store in cleanup.

Thanks,
Zhiping
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help