Thread (255 messages) 255 messages, 11 authors, 2021-10-29

RE: [RFC 01/20] iommu/iommufd: Add /dev/iommu core

From: Liu, Yi L <hidden>
Date: 2021-10-15 09:18:26
Also in: linux-iommu, lkml

From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Tuesday, September 21, 2021 11:42 PM

On Sun, Sep 19, 2021 at 02:38:29PM +0800, Liu Yi L wrote:
quoted
/dev/iommu aims to provide a unified interface for managing I/O address
spaces for devices assigned to userspace. This patch adds the initial
framework to create a /dev/iommu node. Each open of this node returns an
iommufd. And this fd is the handle for userspace to initiate its I/O
address space management.

One open:
- We call this feature as IOMMUFD in Kconfig in this RFC. However this
  name is not clear enough to indicate its purpose to user. Back to 2010
  vfio even introduced a /dev/uiommu [1] as the predecessor of its
  container concept. Is that a better name? Appreciate opinions here.

[1]
https://lore.kernel.org/kvm/4c0eb470.1HMjondO00NIvFM6%25pugs@cisco.co
m/
quoted
Signed-off-by: Liu Yi L <redacted>
 drivers/iommu/Kconfig           |   1 +
 drivers/iommu/Makefile          |   1 +
 drivers/iommu/iommufd/Kconfig   |  11 ++++
 drivers/iommu/iommufd/Makefile  |   2 +
 drivers/iommu/iommufd/iommufd.c | 112
++++++++++++++++++++++++++++++++
quoted
 5 files changed, 127 insertions(+)
 create mode 100644 drivers/iommu/iommufd/Kconfig
 create mode 100644 drivers/iommu/iommufd/Makefile
 create mode 100644 drivers/iommu/iommufd/iommufd.c
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 07b7c25cbed8..a83ce0acd09d 100644
+++ b/drivers/iommu/Kconfig
@@ -136,6 +136,7 @@ config MSM_IOMMU

 source "drivers/iommu/amd/Kconfig"
 source "drivers/iommu/intel/Kconfig"
+source "drivers/iommu/iommufd/Kconfig"

 config IRQ_REMAP
 	bool "Support for Interrupt Remapping"
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index c0fb0ba88143..719c799f23ad 100644
+++ b/drivers/iommu/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
 obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o io-pgfault.o
 obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
+obj-$(CONFIG_IOMMUFD) += iommufd/
diff --git a/drivers/iommu/iommufd/Kconfig
b/drivers/iommu/iommufd/Kconfig
quoted
new file mode 100644
index 000000000000..9fb7769a815d
+++ b/drivers/iommu/iommufd/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config IOMMUFD
+	tristate "I/O Address Space management framework for passthrough
devices"
quoted
+	select IOMMU_API
+	default n
+	help
+	  provides unified I/O address space management framework for
+	  isolating untrusted DMAs via devices which are passed through
+	  to userspace drivers.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/iommu/iommufd/Makefile
b/drivers/iommu/iommufd/Makefile
quoted
new file mode 100644
index 000000000000..54381a01d003
+++ b/drivers/iommu/iommufd/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_IOMMUFD) += iommufd.o
diff --git a/drivers/iommu/iommufd/iommufd.c
b/drivers/iommu/iommufd/iommufd.c
quoted
new file mode 100644
index 000000000000..710b7e62988b
+++ b/drivers/iommu/iommufd/iommufd.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * I/O Address Space Management for passthrough devices
+ *
+ * Copyright (C) 2021 Intel Corporation
+ *
+ * Author: Liu Yi L <yi.l.liu@intel.com>
+ */
+
+#define pr_fmt(fmt)    "iommufd: " fmt
+
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/miscdevice.h>
+#include <linux/mutex.h>
+#include <linux/iommu.h>
+
+/* Per iommufd */
+struct iommufd_ctx {
+	refcount_t refs;
+};
A private_data of a struct file should avoid having a refcount (and
this should have been a kref anyhow)

Use the refcount on the struct file instead.

In general the lifetime models look overly convoluted to me with
refcounts being used as locks and going in all manner of directions.

- No refcount on iommufd_ctx, this should use the fget on the fd.
  The driver facing version of the API has the driver holds a fget
  inside the iommufd_device.

- Put a rwlock inside the iommufd_ioas that is a
  'destroying_lock'. The rwlock starts out unlocked.

  Acquire from the xarray is
   rcu_lock()
   ioas = xa_load()
   if (ioas)
      if (down_read_trylock(&ioas->destroying_lock))
all good suggestions, will refine accordingly. Here destroying_lock is a
rw_semaphore. right? Since down_read_trylock() accepts a rwsem.

Thanks,
Yi Liu
           // success
  Unacquire is just up_read()

  Do down_write when the ioas is to be destroyed, do not return ebusy.

 - Delete the iommufd_ctx->lock. Use RCU to protect load, erase/alloc does
   not need locking (order it properly too, it is in the wrong order), and
   don't check for duplicate devices or dev_cookie duplication, that
   is user error and is harmless to the kernel.
quoted
+static int iommufd_fops_release(struct inode *inode, struct file *filep)
+{
+	struct iommufd_ctx *ictx = filep->private_data;
+
+	filep->private_data = NULL;
unnecessary
quoted
+	iommufd_ctx_put(ictx);
+
+	return 0;
+}
+
+static long iommufd_fops_unl_ioctl(struct file *filep,
+				   unsigned int cmd, unsigned long arg)
+{
+	struct iommufd_ctx *ictx = filep->private_data;
+	long ret = -EINVAL;
+
+	if (!ictx)
+		return ret;
impossible
quoted
+
+	switch (cmd) {
+	default:
+		pr_err_ratelimited("unsupported cmd %u\n", cmd);
don't log user triggerable events

Jason
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help