Thread (33 messages), 4 authors, 2019-01-21

Re: IXGBE, IOMMU DMAR DRHD handling fault issue

From: Ravi Kerur <hidden>
Date: 2018-01-29 22:35:18

Hi Burakov,

When using vfio-pci on the host, both VF and PF interfaces work fine with
DPDK, i.e. I don't see DMAR fault messages anymore. However, when I attach a
VF interface to a VM and start DPDK with vfio-pci inside the VM, I still see
DMAR fault messages on the host. Both host and VM are booted with
'intel_iommu=on' on the GRUB command line. Ping from the VM with DPDK/vfio-pci
doesn't work (I think that's expected because of the DMAR faults); however,
when the VF interface uses the ixgbevf driver, ping works.
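
For reference, the kernel parameter is spelled 'intel_iommu=on' (with an
underscore), and whether a given system actually booted with it can be checked
from its kernel command line. A minimal sketch, assuming a Linux system that
exposes /proc/cmdline; the helper name has_kernel_param and its optional file
argument are made up for this example and are not part of DPDK or the kernel:

```shell
#!/bin/sh
# has_kernel_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline; the override exists only so the check
# can be tried against a saved copy of another machine's command line.
has_kernel_param() {
    # Put each parameter on its own line, then require an exact,
    # fixed-string, whole-line match.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Typical use, run once on the host and once inside the VM:
#   has_kernel_param intel_iommu=on && echo "intel_iommu=on is set"
#   has_kernel_param iommu=pt       && echo "pass-through mode requested"
```

Because the match is exact, a hyphenated 'intel-iommu=on' would not be
reported as set.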

Following are some details

/*****************On VM***************/
dpdk-devbind -s

Network devices using DPDK-compatible driver
============================================
0000:00:07.0 '82599 Ethernet Controller Virtual Function' drv=vfio-pci unused=ixgbevf

Network devices using kernel driver
===================================
0000:03:00.0 'Device 1041' if=eth0 drv=virtio-pci unused=vfio-pci *Active*
0000:04:00.0 'Device 1041' if=eth1 drv=virtio-pci unused=vfio-pci
0000:05:00.0 'Device 1041' if=eth2 drv=virtio-pci unused=vfio-pci

Other network devices
=====================
<none>

Crypto devices using DPDK-compatible driver
===========================================
<none>

Crypto devices using kernel driver
==================================
<none>

Other crypto devices
====================
<none>


00:07.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
        Subsystem: Intel Corporation 82599 Ethernet Controller Virtual Function
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Region 0: Memory at fda00000 (64-bit, prefetchable) [size=16K]
        Region 3: Memory at fda04000 (64-bit, prefetchable) [size=16K]
        Capabilities: [70] MSI-X: Enable+ Count=3 Masked-
                Vector table: BAR=3 offset=00000000
                PBA: BAR=3 offset=00002000
        Capabilities: [a0] Express (v1) Root Complex Integrated Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0
                        ExtTag- RBE-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Kernel driver in use: vfio-pci
        Kernel modules: ixgbevf

/***************on Host*************/
dmesg | grep DMAR
...
[  978.268143] DMAR: DRHD: handling fault status reg 2
[  978.268147] DMAR: [DMA Read] Request device [04:10.0] fault addr 33a128000 [fault reason 06] PTE Read access is not set
[ 1286.677726] DMAR: DRHD: handling fault status reg 102
[ 1286.677730] DMAR: [DMA Read] Request device [04:10.0] fault addr fb663000 [fault reason 06] PTE Read access is not set
[ 1676.436145] DMAR: DRHD: handling fault status reg 202
[ 1676.436149] DMAR: [DMA Read] Request device [04:10.0] fault addr 33a128000 [fault reason 06] PTE Read access is not set
[ 1734.433649] DMAR: DRHD: handling fault status reg 302
[ 1734.433652] DMAR: [DMA Read] Request device [04:10.0] fault addr 33a128000 [fault reason 06] PTE Read access is not set
[ 2324.428938] DMAR: DRHD: handling fault status reg 402
[ 2324.428942] DMAR: [DMA Read] Request device [04:10.0] fault addr 7770c000 [fault reason 06] PTE Read access is not set
[ 2388.553640] DMAR: DRHD: handling fault status reg 502
[ 2388.553643] DMAR: [DMA Read] Request device [04:10.0] fault addr 33a128000 [fault reason 06] PTE Read access is not set
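
To see which device and which address dominate, the fault records can be
tallied. A minimal sketch, assuming each fault record sits on a single line as
dmesg prints it (any wrapping in this mail comes from the mail client); the
function name summarize_dmar_faults is made up for this example:

```shell
#!/bin/sh
# summarize_dmar_faults
# Reads kernel log text on stdin and prints one line per (device, address)
# pair in the form "count device address", most frequent first.
summarize_dmar_faults() {
    # Keep only the device and the fault address of each DMAR fault record.
    sed -n 's/.*Request device \[\([0-9a-f:.]*\)] fault addr \([0-9a-f]*\).*/\1 \2/p' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2, $3 }'   # normalise the padding uniq -c adds
}

# Typical use on the host:
#   dmesg | summarize_dmar_faults
```

Of the six fault records shown above, four hit address 33a128000, so that
address would be listed first for device 04:10.0.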


VM is started with

qemu-system-x86_64 -enable-kvm -M q35,accel=kvm,kernel-irqchip=split
-object iothread,id=iothread0 -device
intel-iommu,intremap=on,device-iotlb=on,caching-mode=on -cpu host
-daemonize -m 16G -smp 14 -uuid 0fc91c66-f0b1-11e7-acf4-525400123456 -name
212748-sriov-ravi-smac-alpha-SMAC10 -device ioh3420,id=root.1,chassis=1
-device ioh3420,id=root.2,chassis=2 -netdev
tap,vhost=on,queues=2,ifname=vn-vn2_1_,downscript=no,id=vn-vn2_1_,script=no
-device ioh3420,id=root.3,chassis=3 -device
virtio-net-pci,netdev=vn-vn2_1_,bus=root.3,ats=on,mq=on,vectors=6,mac=DE:AD:02:88:10:37,id=vn-vn2_1__dev
-netdev
tap,vhost=on,queues=2,ifname=vn-vn92_1_,downscript=no,id=vn-vn92_1_,script=no
-device ioh3420,id=root.4,chassis=4 -device
virtio-net-pci,mac=DE:AD:02:88:10:38,netdev=vn-vn92_1_,bus=root.4,ats=on,mq=on,vectors=6,id=vn-vn92_1__dev
-netdev
tap,vhost=on,queues=2,ifname=vn-vn93_1_,downscript=no,id=vn-vn93_1_,script=no
-device ioh3420,id=root.5,chassis=5 -device
virtio-net-pci,mac=DE:AD:02:88:10:39,netdev=vn-vn93_1_,bus=root.5,ats=on,mq=on,vectors=6,id=vn-vn93_1__dev
-vnc :16,websocket=15916 -qmp tcp:127.0.0.1:12001,server,nowait -chardev
socket,id=charmonitor,path=/tmp/mon.12001,server,nowait -mon
chardev=charmonitor,id=monitor -cdrom
/var/venom/cloud_init/0fc91c66-f0b1-11e7-acf4-525400123456.iso -device
vfio-pci,host=04:10.0 -drive
file=/var/venom/instance_repo/test.img,if=none,id=drive-virtio-disk0,format=raw,aio=native,cache=none
-balloon none -device
virtio-blk-pci,scsi=off,iothread=iothread0,drive=drive-virtio-disk0,id=virtio-disk0,bus=root.1,ats=on,bootindex=1

Thanks.


On Thu, Jan 25, 2018 at 2:49 AM, Burakov, Anatoly <anatoly.burakov@intel.com>
wrote:
On 24-Jan-18 7:13 PM, Ravi Kerur wrote:
Hi Burakov, thank you. I will try the vfio-pci driver. I assume it will work
for both PF and VF interfaces, since I am using both in my setup?

Thanks.
Yes, it should work for both PF and VF devices.
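
As a rough sketch of what binding both device types could look like, assuming
dpdk-devbind is on the PATH as in the listings in this thread and the vfio-pci
kernel module is available. The two PCI addresses are the PF and VF shown
bound to igb_uio in the original report; the run helper and the DRY_RUN
variable are hypothetical conveniences added for this example:

```shell
#!/bin/sh
# run CMD [ARGS...]
# Executes the command, or only prints it when DRY_RUN=1 so the steps can
# be reviewed before running them as root.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# bind_to_vfio BDF...
# Loads vfio-pci, binds every listed PCI device to it, then shows the status.
bind_to_vfio() {
    run modprobe vfio-pci || return 1
    for bdf in "$@"; do
        run dpdk-devbind --bind=vfio-pci "$bdf" || return 1
    done
    run dpdk-devbind --status
}

# Print the commands for one PF and one VF without executing them.
DRY_RUN=1
bind_to_vfio 0000:01:00.0 0000:04:10.2
```

Setting DRY_RUN=0 (or leaving it unset) runs the commands for real, which
requires root.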

On Wed, Jan 24, 2018 at 2:31 AM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote:

    On 23-Jan-18 5:25 PM, Ravi Kerur wrote:

        Hi,

        I am running into an issue when DPDK is started with the IOMMU
        enabled via the GRUB command line. The problem is not seen with the
        regular kernel driver; the error messages appear when DPDK is
        started, for both PF and VF interfaces.

        I am using DPDK 17.05, so the patch proposed in the following link
        is available:
        http://dpdk.org/ml/archives/dev/2017-February/057048.html

        The workaround is to use "iommu=pt", but I want the IOMMU enabled
        in my setup. I checked the BIOS for reserved memory (DMA RMRR for
        IXGBE) but didn't find any details on it.

        Kindly let me know how to resolve this issue.

        Following are the details

        (1) Linux kernel 4.9
        (2) DPDK 17.05

        (3) IXGBE details
        ethtool -i enp4s0f0  (PF driver)
        driver: ixgbe
        version: 5.3.3
        firmware-version: 0x800007b8, 1.1018.0
        bus-info: 0000:04:00.0
        supports-statistics: yes
        supports-test: yes
        supports-eeprom-access: yes
        supports-register-dump: yes
        supports-priv-flags: yes

        ethtool -i enp4s16f2 (VF driver)
        driver: ixgbevf
        version: 4.3.2
        firmware-version:
        bus-info: 0000:04:10.2
        supports-statistics: yes
        supports-test: yes
        supports-eeprom-access: no
        supports-register-dump: yes
        supports-priv-flags: no

        Bus info          Device       Class          Description
        =========================================================
        pci@0000:01:00.0  ens11f0      network        82599ES 10-Gigabit SFI/SFP+ Network Connection
        pci@0000:01:00.1  ens11f1      network        82599ES 10-Gigabit SFI/SFP+ Network Connection
        pci@0000:04:00.0  enp4s0f0     network        82599ES 10-Gigabit SFI/SFP+ Network Connection
        pci@0000:04:00.1  enp4s0f1     network        82599ES 10-Gigabit SFI/SFP+ Network Connection
        pci@0000:04:10.0  enp4s16      network        Illegal Vendor ID
        pci@0000:04:10.2  enp4s16f2    network        Illegal Vendor ID

        (4) DPDK bind interfaces

        # dpdk-devbind -s

        Network devices using DPDK-compatible driver
        ============================================
        0000:01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=igb_uio unused=vfio-pci
        0000:04:10.2 '82599 Ethernet Controller Virtual Function 10ed' drv=igb_uio unused=vfio-pci

        Network devices using kernel driver
        ===================================
        0000:01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' if=ens11f1 drv=ixgbe unused=igb_uio,vfio-pci
        0000:04:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' if=enp4s0f0 drv=ixgbe unused=igb_uio,vfio-pci
        0000:04:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' if=enp4s0f1 drv=ixgbe unused=igb_uio,vfio-pci
        0000:04:10.0 '82599 Ethernet Controller Virtual Function 10ed' if=enp4s16 drv=ixgbevf unused=igb_uio,vfio-pci
        0000:06:00.0 'I210 Gigabit Network Connection 1533' if=eno1 drv=igb unused=igb_uio,vfio-pci *Active*

        Other Network devices
        =====================
        <none>

        ...

        (5) Kernel dmesg

        # dmesg | grep -e DMAR
        [    0.000000] ACPI: DMAR 0x000000007999BAD0 0000E0 (v01 ALASKA A M I 00000001 INTL 20091013)
        [    0.000000] DMAR: IOMMU enabled
        [    0.518747] DMAR: Host address width 46
        [    0.526616] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
        [    0.537447] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020df
        [    0.553620] DMAR: DRHD base: 0x000000c7ffc000 flags: 0x1
        [    0.564445] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap d2078c106f0466 ecap f020df
        [    0.580611] DMAR: RMRR base: 0x0000007bbc6000 end: 0x0000007bbd4fff
        [    0.593344] DMAR: ATSR flags: 0x0
        [    0.600178] DMAR: RHSA base: 0x000000c7ffc000 proximity domain: 0x0
        [    0.612905] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x1
        [    0.625632] DMAR-IR: IOAPIC id 3 under DRHD base  0xfbffc000 IOMMU 0
        [    0.638522] DMAR-IR: IOAPIC id 1 under DRHD base  0xc7ffc000 IOMMU 1
        [    0.651426] DMAR-IR: IOAPIC id 2 under DRHD base  0xc7ffc000 IOMMU 1
        [    0.664324] DMAR-IR: HPET id 0 under DRHD base 0xc7ffc000
        [    0.675326] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
        [    0.693805] DMAR-IR: Enabled IRQ remapping in x2apic mode
        [    9.395170] DMAR: dmar1: Using Queued invalidation
        [    9.405011] DMAR: Setting RMRR:
        [    9.412006] DMAR: Setting identity map for device 0000:00:1d.0 [0x7bbc6000 - 0x7bbd4fff]
        [    9.428569] DMAR: Prepare 0-16MiB unity mapping for LPC
        [    9.439712] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
        [    9.454684] DMAR: Intel(R) Virtualization Technology for Directed I/O
        [  287.023068] DMAR: DRHD: handling fault status reg 2
        [  287.023073] DMAR: [DMA Read] Request device [01:00.0] fault addr 18a260a000 [fault reason 06] PTE Read access is not set
        [  287.023180] DMAR: DRHD: handling fault status reg 102
        [  287.023183] DMAR: [DMA Read] Request device [01:00.0] fault addr 18a3010000 [fault reason 06] PTE Read access is not set
        [  287.038250] DMAR: DRHD: handling fault status reg 202
        [  287.038252] DMAR: [DMA Read] Request device [01:00.0] fault addr 18a3010000 [fault reason 06] PTE Read access is not set
        [  288.170165] DMAR: DRHD: handling fault status reg 302
        [  288.170170] DMAR: [DMA Read] Request device [01:00.0] fault addr 1890754000 [fault reason 06] PTE Read access is not set
        [  288.694496] DMAR: DRHD: handling fault status reg 402
        [  288.694499] DMAR: [DMA Read] Request device [04:10.2] fault addr 189069c000 [fault reason 06] PTE Read access is not set
        [  289.927113] DMAR: DRHD: handling fault status reg 502
        [  289.927116] DMAR: [DMA Read] Request device [01:00.0] fault addr 1890754000 [fault reason 06] PTE Read access is not set
        [  290.174275] DMAR: DRHD: handling fault status reg 602
        [  290.174279] DMAR: [DMA Read] Request device [01:00.0] fault addr 1890754000 [fault reason 06] PTE Read access is not set
        [  292.174247] DMAR: DRHD: handling fault status reg 702
        [  292.174251] DMAR: [DMA Read] Request device [01:00.0] fault addr 1890754000 [fault reason 06] PTE Read access is not set
        [  294.174227] DMAR: DRHD: handling fault status reg 2
        [  294.174230] DMAR: [DMA Read] Request device [01:00.0] fault addr 1890754000 [fault reason 06] PTE Read access is not set
        [  296.174216] DMAR: DRHD: handling fault status reg 102
        [  296.174219] DMAR: [DMA Read] Request device [01:00.0] fault addr 1890754000 [fault reason 06] PTE Read access is not set
        [root@infradev-comp006.naw02.infradev.viasat.io ~] #

        Thanks.


    Hi Ravi,

    The "iommu=pt" workaround applies only when you want to use the igb_uio
    driver. The VFIO driver is able to fully utilize the IOMMU without the
    need for pass-through mode. From your log I can see that some devices are
    bound to igb_uio while others are bound to vfio-pci. Just bind all
    of the devices you want to use with DPDK to vfio-pci and these
    errors should go away.

    --
    Thanks,
    Anatoly

--
Thanks,
Anatoly