Thread: 16 messages, 2 authors, 2018-06-28

Re: [RFC] Add support for device dma mask

From: Alejandro Lucero <hidden>
Date: 2018-06-27 16:52:44

On Wed, Jun 27, 2018 at 2:24 PM, Burakov, Anatoly <anatoly.burakov@intel.com>
wrote:
On 27-Jun-18 11:13 AM, Alejandro Lucero wrote:

On Wed, Jun 27, 2018 at 9:17 AM, Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:

    On 26-Jun-18 6:37 PM, Alejandro Lucero wrote:

        This RFC tries to handle devices with addressing limitations.
        NFP 4000/6000 devices can only handle 40-bit addresses, which
        causes problems with physical addresses on machines with more
        than 1TB of memory. But because of how iovas are configured,
        which can be equivalent to physical addresses or based on
        virtual addresses, this problem can be more likely.

        I tried to solve this some time ago:

        https://www.mail-archive.com/dev@dpdk.org/msg45214.html

        It was delayed because there were some changes in progress with
        EAL device handling and, to be honest, I completely forgot about
        this until now, when I have had to work on supporting NFP devices
        with DPDK and non-root users.

        I was working on a patch to be applied to the main DPDK branch
        upstream, but because of changes to memory initialization during
        the last months, it cannot be backported to stable versions, at
        least the part where the hugepage iovas are checked.

        I realize stable versions only allow bug fixes, and this patchset
        arguably does not qualify as one. But without it, DPDK could,
        although it is unlikely, be used on a machine with more than 1TB
        of memory, and the NFP would then use the wrong DMA host
        addresses.

        Although virtual addresses used as iovas are more dangerous, for
        DPDK versions
        before 18.05 this is not worse than with physical addresses,
        because iovas,
        when physical addresses are not available, are based on a
        starting address set
        to 0x0.


    You might want to look at the following patch:

    http://patches.dpdk.org/patch/37149/

    Since this patch, IOVA as VA mode uses VA addresses, and that has
    been backported to earlier releases. I don't think there's any case
    where we used zero-based addresses any more.


But memsegs get the iova based on hugepages physaddr, and for VA mode
that is based on 0x0 as starting point.

And as far as I know, memsegs iovas are what end up being used for IOMMU
mappings and what devices will use.
When physaddrs are available, IOVA as PA mode assigns IOVA addresses
to PA, while IOVA as VA mode assigns IOVA addresses to VA (both 18.05+ and
pre-18.05 as per above patch, which was applied to pre-18.05 stable
releases).

When physaddrs aren't available, IOVA as VA mode assigns IOVA addresses to
VA, both 18.05+ and pre-18.05, as per above patch.
This is right.

If physaddrs aren't available and IOVA as PA mode is used, then, as far
as I can remember, even though technically memsegs get their addresses set
to 0x0 onwards, the actual addresses we get in memzones etc. are
RTE_BAD_IOVA.
This is not right. Not sure if this was the intention, but in PA mode with
physaddrs not available, this code inside vfio_type1_dma_map:

                if (rte_eal_iova_mode() == RTE_IOVA_VA)
                        dma_map.iova = dma_map.vaddr;
                else
                        dma_map.iova = ms[i].iova;

does the IOMMU mapping using the iovas and not the vaddr, with the iovas
starting at 0x0.

Note that the NFP PMD does not have the RTE_PCI_DRV_IOVA_AS_VA flag, so this
is always the case when executing DPDK apps as non-root users.

I would say that if there is no such flag, and the IOVA mode is then PA, the
mapping should fail, as it does with 18.05.
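
As a minimal sketch of the behaviour suggested above (not the actual DPDK
code), the IOVA selection could reject PA mode when no physical addresses
are known, instead of mapping zero-based iovas. The names select_iova,
iova_mode and BAD_IOVA are hypothetical stand-ins used only for
illustration:

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the DPDK definitions. */
enum iova_mode { IOVA_MODE_PA, IOVA_MODE_VA };
#define BAD_IOVA UINT64_MAX

/*
 * Pick the IOVA to use when mapping one memory segment.
 * Returns 0 and stores the address in *iova, or -1 when PA mode is
 * requested but no usable physical address is known for the segment.
 */
static int
select_iova(enum iova_mode mode, uint64_t vaddr, uint64_t seg_iova,
	    int physaddrs_available, uint64_t *iova)
{
	if (mode == IOVA_MODE_VA) {
		/* VA mode: the device sees the process virtual address. */
		*iova = vaddr;
		return 0;
	}

	/* PA mode without real physical addresses: refuse to map. */
	if (!physaddrs_available || seg_iova == BAD_IOVA)
		return -1;

	*iova = seg_iova;
	return 0;
}
```

With this shape, a caller in PA mode running as a non-root user would get
an error back rather than a mapping built on iovas starting at 0x0.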

I could send a patch for this behaviour, but in that case I would
like to add that flag to the NFP PMD and include the hugepage checking along
with changes to how iovas are obtained when mmapping, keeping the iovas
below the proposed DMA mask.
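
A rough sketch of the kind of check this implies, assuming the device DMA
mask is expressed as a number of address bits (40 for the NFP case
described in the RFC). The seg struct and both function names are
hypothetical and are not taken from the patchset:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor: start IOVA and length in bytes. */
struct seg {
	uint64_t iova;
	uint64_t len;
};

/*
 * Return 1 if the range [iova, iova + len) is addressable with
 * maskbits address bits, 0 otherwise.
 */
static int
iova_fits_mask(uint64_t iova, uint64_t len, unsigned int maskbits)
{
	uint64_t last = len ? iova + len - 1 : iova;

	if (last < iova)	/* range wraps around the 64-bit space */
		return 0;
	if (maskbits >= 64)
		return 1;
	/* No bit above the mask may be set in the last byte address. */
	return (last & ~((UINT64_C(1) << maskbits) - 1)) == 0;
}

/* Return 0 if every segment fits the mask, -1 otherwise. */
static int
check_dma_mask(const struct seg *segs, size_t n, unsigned int maskbits)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!iova_fits_mask(segs[i].iova, segs[i].len, maskbits))
			return -1;
	return 0;
}
```

With a 40-bit mask, a segment ending at or below the 1TB boundary passes,
while any iova derived from a typical 64-bit process virtual address is
rejected.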

        Since 18.05, those iovas can be, and usually are, higher than
        1TB, as they are based on 64-bit address space addresses, and
        by default the kernel uses a starting point far higher than 1TB.

        This patchset applies to stable 17.11.3 but I will be happy to
        submit patches, if
        required, for other DPDK stable versions.




    --
    Thanks,
    Anatoly

--
Thanks,
Anatoly