Thread (35 messages) 35 messages, 9 authors, 2017-11-03

Re: [PATCH] igb_uio: revert open and release operations

From: Patil, Harish <hidden>
Date: 2017-10-19 22:43:13

-----Original Message-----
From: Harish Patil <redacted>
Date: Tuesday, October 17, 2017 at 9:50 PM
To: Thomas Monjalon <redacted>, Ferruh Yigit
[off-list ref]
Cc: "dev@dpdk.org" <redacted>, Jianfeng Tan <redacted>,
Jingjing Wu [off-list ref], "Thotton, Shijith"
[off-list ref], Gregory Etelson [off-list ref], George
Prekas [off-list ref], "stable@dpdk.org" [off-list ref]
Subject: Re: [PATCH] igb_uio: revert open and release operations

-----Original Message-----
From: Thomas Monjalon <redacted>
Date: Tuesday, October 17, 2017 at 1:33 PM
To: Ferruh Yigit <redacted>, Harish Patil
[off-list ref]
Cc: "dev@dpdk.org" <redacted>, Jianfeng Tan <redacted>,
Jingjing Wu [off-list ref], "Thotton, Shijith"
[off-list ref], Gregory Etelson [off-list ref], George
Prekas [off-list ref], "stable@dpdk.org" [off-list ref]
Subject: Re: [PATCH] igb_uio: revert open and release operations
quoted
17/10/2017 22:14, Ferruh Yigit:
quoted
There were bug reports about terminated application may leave device in
undesired state:
http://dpdk.org/ml/archives/dev/2016-November/049745.html
http://dpdk.org/ml/archives/dev/2016-November/050932.html

And a proposal to fix:
http://dpdk.org/ml/archives/dev/2016-December/051844.html

Later another proposal triggered the discussion:
http://dpdk.org/ml/archives/dev/2017-May/066317.html

Finally a fix patch pushed into v17.08:
Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
device file")

Later a regression report sent related to the pushed patch:
http://dpdk.org/ml/archives/dev/2017-September/075236.html

And a fix for regression integrated into v17.11-rc1:
http://dpdk.org/ml/archives/dev/2017-October/079166.html
Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in
VM")
Commit: 6b9ed026a870 ("igb_uio: fix build with kernel <= 3.17")

Even after the fix qede PMD reported to be broken:
http://dpdk.org/ml/archives/dev/2017-October/079359.html

So this patch reverts original fix and related commits. The related
igb_uio code part turns back to v17.05 base.
[...]
quoted
---
It would be nice to solve this issue in LTS release, but being close to
the release and the error report without details makes it hard to work
more on this issue.
With this revert, we are back to the original issue.
We must really try the proposed solution.
Harish, please describe your issue and think how it could be fixed.
Jingjing made it work for i40e.
I know it is less effort to request a simple revert.
Please let's try to fix it once for all.
Hi Ferruh/Thomas,
I’m discussing it internally, so please hold on committing this patch till
I revert back to you.

Thanks.
[Harish]
1) With the introduction of:
Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
device file”)
   We saw failures with qede PF & SR-IOV VF initialization in PCI passthru
mode.
PF PCI passthru mode initialization failure was resolved by:
“Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in
VM")
SR-IOV VF PCI passthru mode initialization issue is that PCI FLR and
related device cleanup is not completed by the time VF driver starts
loading. It results in the mbox command failure sent over the HW channel
between VF and PF.

Even though pci_reset_function() waits for the stipulated amount of time
per standards, VF FLR takes longer than that and pci_reset_function() &
igb_uio_open() call returns before FLR completes and VF PMD driver tries
to load before FLR completes leading to VF PMD initialization failure.


We can work around this problem by adding driver delay/retry logic since
there is no deterministic way of detecting FLR completion. But this is
going to increase the driver load time.
2) With the above patch ("igb_uio: issue FLR during open and release of
device file), FLR is going to be issued unconditionally on all devices
during igb_uio_open. We think it’s an over kill. FLR is required only for
previous abnormal app termination. We already handle the abnormal app
termination by doing necessary cleanup in the driver during load. This
cleanup is more efficient as it is done only when required. So we feel
that the drivers/devices needing such cleanup (the two cases listed
below)  should do it conditionally when required rather than
igb_uio_open() unconditionally performing FLR all the time.

e.g.,
   - cdb166963cae ("net/liquidio: add API for VF FLR”)

- http://dpdk.org/ml/archives/dev/2017-May/066317.html
Thanks,
Harish
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help