Thread (11 messages), 3 authors, 2017-07-19

[PATCH v7 0/3] PCI/IOMMU: Reserve IOVAs for PCI inbound memory

From: Oza Oza <hidden>
Date: 2017-05-23 05:00:56
Also in: linux-devicetree, linux-iommu, lkml

On Tue, May 23, 2017 at 12:48 AM, Alex Williamson
[off-list ref] wrote:
On Mon, 22 May 2017 22:09:39 +0530
Oza Pawandeep [off-list ref] wrote:
quoted
iProc-based PCI RC and Stingray SoC have a limitation of addressing only
512GB of memory at once.

IOVA allocation honors the device's coherent_dma_mask/dma_mask.
In the PCI case, the current code honors the DMA mask set by the EP; there
is no concept of a PCI host bridge dma-mask. There should be one, so that it
could truly reflect the limitation of the PCI host bridge.

However, even assuming Linux takes care of the largest possible dma_mask,
the limitation could still exist because of the way memory banks are
implemented.

For example, memory banks:
<0x00000000 0x80000000 0x0 0x80000000>, /* 2G @ 2G */
<0x00000008 0x80000000 0x3 0x80000000>, /* 14G @ 34G */
<0x00000090 0x00000000 0x4 0x00000000>, /* 16G @ 576G */
<0x000000a0 0x00000000 0x4 0x00000000>; /* 16G @ 640G */
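
To make the arithmetic behind these banks explicit, here is a minimal sketch
in C that decodes each <base_hi base_lo size_hi size_lo> tuple and checks it
against a 512GB limit. The names mem_bank, cells_to_u64, bank_reachable and
RC_INBOUND_LIMIT are hypothetical, and placing the reachable window at
[0, 512GB) is an assumption made only for illustration; it is not code from
the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bank as listed above: <base_hi base_lo size_hi size_lo>, 32-bit cells. */
struct mem_bank {
    uint32_t base_hi, base_lo, size_hi, size_lo;
};

/* Assumption for illustration: the RC reaches CPU addresses [0, 512GB). */
#define RC_INBOUND_LIMIT (512ULL << 30)

/* Join two 32-bit cells into one 64-bit value. */
uint64_t cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* True if the whole bank lies below the assumed inbound limit. */
bool bank_reachable(const struct mem_bank *b)
{
    uint64_t base = cells_to_u64(b->base_hi, b->base_lo);
    uint64_t size = cells_to_u64(b->size_hi, b->size_lo);

    /* Reject empty banks and ranges that would wrap around 64 bits. */
    assert(size != 0 && base <= UINT64_MAX - size);
    return base + size <= RC_INBOUND_LIMIT;
}

/* The four banks from the listing above. */
const struct mem_bank banks[] = {
    { 0x00000000, 0x80000000, 0x0, 0x80000000 }, /* 2G @ 2G */
    { 0x00000008, 0x80000000, 0x3, 0x80000000 }, /* 14G @ 34G */
    { 0x00000090, 0x00000000, 0x4, 0x00000000 }, /* 16G @ 576G */
    { 0x000000a0, 0x00000000, 0x4, 0x00000000 }, /* 16G @ 640G */
};
```

Under that assumption, bank_reachable() holds for the first two banks and
fails for the banks at 576G and 640G, which start above 512GB.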

The problem shows up when running user space (SPDK), which internally uses
vfio in order to access the PCI endpoint directly.

vfio uses hugepages, which could come from 640G/0x000000a0.
The way vfio maps a hugepage is to use its physical address as the IOVA;
VFIO_IOMMU_MAP_DMA ends up calling iommu_map, which in turn calls
arm_lpae_map, mapping IOVAs out of range.

So the way the kernel allocates IOVAs (where it honours the device dma_mask)
and the way userspace gets IOVAs are different.

dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>; will not work.

Instead, we have to go for scattered dma-ranges, leaving holes.
Hence, we have to reserve IOVA allocations for inbound memory.
This patch set addresses only the IOVA allocation problem.
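
As a rough illustration of "scattered dma-ranges leaving holes", the sketch
below walks a sorted list of inbound windows and collects the gaps between
them, which are the IOVA ranges one would want reserved. It is not the code
from this series: iova_range, find_iova_holes, example_windows, the 40-bit
IOVA space, and the choice of one window per memory bank are all assumptions
made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GiB(x) ((uint64_t)(x) << 30)

/* Address range with inclusive bounds. */
struct iova_range {
    uint64_t start, end;
};

/*
 * Collect the gaps not covered by any inbound window in [0, space_end].
 * Windows must be sorted, non-overlapping and within the space.
 * Returns the number of holes written to 'holes'.
 */
size_t find_iova_holes(const struct iova_range *win, size_t n,
                       uint64_t space_end,
                       struct iova_range *holes, size_t max_holes)
{
    uint64_t cursor = 0; /* first address not yet accounted for */
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        assert(win[i].start <= win[i].end);
        assert(win[i].start >= cursor && win[i].end <= space_end);

        if (win[i].start > cursor) {
            assert(count < max_holes);
            holes[count].start = cursor;
            holes[count].end = win[i].start - 1;
            count++;
        }
        if (win[i].end == space_end)
            return count; /* nothing left above this window */
        cursor = win[i].end + 1;
    }

    /* Tail of the space above the last window. */
    assert(count < max_holes);
    holes[count].start = cursor;
    holes[count].end = space_end;
    return count + 1;
}

/* Hypothetical windows, one per memory bank from the example above. */
const struct iova_range example_windows[] = {
    { GiB(2),   GiB(4) - 1 },   /* 2G @ 2G */
    { GiB(34),  GiB(48) - 1 },  /* 14G @ 34G */
    { GiB(576), GiB(592) - 1 }, /* 16G @ 576G */
    { GiB(640), GiB(656) - 1 }, /* 16G @ 640G */
};
```

With these hypothetical windows and an assumed space ending at 2^40 - 1, the
function reports five holes: below 2G, 4G to 34G, 48G to 576G, 592G to 640G,
and everything from 656G upward.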

The description here confuses me, with vfio the user owns the iova
allocation problem.  Mappings are only identity mapped if the user
chooses to do so.  The dma_mask of the device is set by the driver and
only relevant to the DMA-API.  vfio is a meta-driver and doesn't know
the dma_mask of any particular device, that's the user's job.  Is the
net result of what's happening here for the vfio case simply to expose
extra reserved regions in sysfs, which the user can then consume to
craft a compatible iova?  Thanks,

Alex
Hi Alex,

This is not a VFIO problem. The reason I mentioned VFIO is that I wanted
to present the problem statement as a whole (which includes both kernel
space and user space).
The way the SPDK pipeline is set up, yes, mappings are identity mapped, and
whatever IOVA user space passes down, VFIO uses as is, which is fine and
expected.

But the problem is that user-space physical memory (hugepages) can reside
high enough in memory to be beyond the PCI RC's capability.

Again, this is not VFIO's problem, nor is it user space's.
In fact, neither has anything to do with dma-mask.
My reference to dma-mask was about the Linux IOMMU framework (not VFIO).

Regards,
Oza.
quoted
Changes since v7:
- Robin's comment addressed,
where he wanted to remove the dependency between the IOMMU and OF layers.
- Bjorn Helgaas's comments addressed.

Changes since v6:
- Robin's comments addressed.

Changes since v5:
Changes since v4:
Changes since v3:
Changes since v2:
- minor changes, redundant checks removed
- removed internal review

Changes since v1:
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct.
- Convert existing contents of of_dma_get_range function to
  of_bus_default_dma_get_ranges and adding that to the
  default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.


Oza Pawandeep (3):
  OF/PCI: expose inbound memory interface to PCI RC drivers.
  IOMMU/PCI: reserve IOVA for inbound memory for PCI masters
  PCI: add support for inbound windows resources

 drivers/iommu/dma-iommu.c | 44 ++++++++++++++++++++--
 drivers/of/of_pci.c       | 96 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/probe.c       | 30 +++++++++++++--
 include/linux/of_pci.h    |  7 ++++
 include/linux/pci.h       |  1 +
 5 files changed, 170 insertions(+), 8 deletions(-)