Re: Kernel Crash bug of ixgbevf kernel module in "Intel(R) 10GbE PCI Express Virtual Function Driver Version: 4.0.3 Release: 1"
From: Sam <hidden>
Date: 2018-01-31 08:59:14
I'm not sure, but this may be because the ixgbevf driver's "ixgbevf_close()" is called through "dev_close()" without rtnl_lock() held. Going by the log line "bond1: Releasing active interface enp1s16f1", bond is releasing its slaves and then calls "dev_close()" without rtnl_lock(). At the same time, "ixgbevf_service_task()" runs into "ixgbevf_reinit_locked()". The two threads then operate on the same address, and the bug happens.

2018-01-31 15:44 GMT+08:00 Sam [off-list ref]:
The order in step 4 was wrong; it is actually this:

4. Stop bond1 (ifdown bond1), stop dpdkb2 (rte_eth_dev_stop), sleep 5 seconds, start dpdkb2 (rte_eth_dev_start), start bond1 (ifup bond1).

2018-01-31 15:34 GMT+08:00 Sam [off-list ref]:
Hi all,

There is a kernel crash bug in the ixgbevf kernel module in "Intel(R) 10GbE PCI Express Virtual Function Driver Version: 4.0.3 Release: 1".

How to reproduce:

1. Use SR-IOV, like this:

   sudo /usr/local/share/openvswitch/scripts/dpdk_nic_bind --status

   Network devices using DPDK-compatible driver
   ============================================
   0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe
   0000:01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe

   Network devices using kernel driver
   ===================================
   0000:01:10.0 'X540 Ethernet Controller Virtual Function' if=enp1s16 drv=ixgbevf unused=bak,igb_uio
   0000:01:10.1 'X540 Ethernet Controller Virtual Function' if=enp1s16f1 drv=ixgbevf unused=bak,igb_uio
   0000:08:00.0 'I350 Gigabit Network Connection' if=eth2 drv=igb unused=igb_uio
   0000:08:00.1 'I350 Gigabit Network Connection' if=eth3 drv=igb unused=igb_uio

   Other network devices
   =====================
   <none>

2. Bond enp1s16 and enp1s16f1 into bond1, via /etc/sysconfig/ifcfg-bond1.

3. Bond 0000:01:00.0 and 0000:01:00.1 in ovs-dpdk into dpdkb2, via the DPDK API.

4. Stop dpdkb2 (rte_eth_dev_stop), stop bond1 (ifdown bond1), sleep 5 seconds, start bond1 (ifup bond1), start dpdkb2 (rte_eth_dev_start).

After several iterations the bug happens. The attachment is vmcore-dmesg.txt.
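
For anyone trying to reproduce this, the stop/start cycle could be scripted roughly as below. This is only a sketch under assumptions, not the script used for the report. It follows the corrected order from the follow-up message (bond1 goes down first and comes up last). DPDK_STOP_CMD and DPDK_START_CMD are hypothetical placeholders: the report stops and starts dpdkb2 through the DPDK API (rte_eth_dev_stop / rte_eth_dev_start), and no shell command for that is given, so they default to a no-op. The iteration count is also an assumption, since the report only says "several times". DRY_RUN defaults to 1 so that nothing is touched until it is set to 0.

```sh
#!/bin/sh
# Sketch of the reproduction cycle; only defines functions, runs nothing by itself.
# DRY_RUN=1 (default) prints each step; DRY_RUN=0 also executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical hooks (assumption): replace with whatever makes your ovs-dpdk
# setup call rte_eth_dev_stop / rte_eth_dev_start on dpdkb2. Default is a no-op.
DPDK_STOP_CMD="${DPDK_STOP_CMD:-true}"
DPDK_START_CMD="${DPDK_START_CMD:-true}"

run_step() {
    # Print the step, and execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro_cycle() {
    # One cycle, in the corrected order from the follow-up message.
    run_step ifdown bond1
    run_step $DPDK_STOP_CMD     # stands in for rte_eth_dev_stop on dpdkb2
    run_step sleep 5
    run_step $DPDK_START_CMD    # stands in for rte_eth_dev_start on dpdkb2
    run_step ifup bond1
}

repro_loop() {
    # Repeat the cycle $1 times (default 10, an arbitrary choice).
    n="${1:-10}"
    i=0
    while [ "$i" -lt "$n" ]; do
        repro_cycle
        i=$((i + 1))
    done
}
```

To use it, source the file and call "repro_loop 20" with DRY_RUN=0 as root, then watch dmesg for the crash. Keeping the DPDK side behind two variables means the loop does not need to change when the way of driving dpdkb2 does.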