Thread: 25 messages, 6 authors, 2018-12-27

Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

From: <hidden>
Date: 2018-11-08 22:49:16
Also in: linux-pci, lkml

On 11/08/2018 04:43 PM, Greg Kroah-Hartman wrote:
> On Thu, Nov 08, 2018 at 03:32:58PM -0700, Keith Busch wrote:
>> On Thu, Nov 08, 2018 at 02:01:17PM -0800, Greg Kroah-Hartman wrote:
>>> On Thu, Nov 08, 2018 at 02:09:17PM -0600, Bjorn Helgaas wrote:
>>>> I'm having second thoughts about this.  One thing I'm uncomfortable
>>>> with is that sprinkling pci_dev_is_disconnected() around feels ad hoc
>>>> instead of systematic, in the sense that I don't know how we convince
>>>> ourselves that this (and only this) is the correct place to put it.
>>>
>>> I think my stance always has been that this call is not good at all
>>> because once you call it you never really know if it is still true as
>>> the device could have been removed right afterward.
>>>
>>> So almost any code that relies on it is broken, there is no locking and
>>> it can and will race and you will lose.
>>
>> AIUI, we're not trying to create code to rely on this. This is more about
>> reducing reliance on hardware. If the software misses the race once and
>> accesses disconnected device memory, that's usually not a big deal to
>> let hardware sort it out, but the point is not to push our luck.
>
> Then why even care about this call at all?  If you need to really know
> if the read worked, you have to check the value.  If the value is FF
> then you have a huge hint that the hardware is now gone.  And you can
> rely on it being gone, you can never rely on making the call to the
> function to check if the hardware is there to be still valid any point
> in time after the call returns.

In the case that we're trying to fix, this code executing is a result of 
the device being gone, so we can guarantee race-free operation. I agree 
that there is a race, in the general case. As far as checking the result 
for all F's, that's not an option when firmware crashes the system as a 
result of the mmio read/write. It's never pretty when firmware gets 
involved.

Alex
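
The two approaches debated in this message can be sketched in user-space C.
This is only an illustration under stated assumptions: struct fake_dev,
fake_readl(), read_if_connected() and read_checked() are hypothetical names,
not kernel APIs, and device removal is simulated with a plain boolean rather
than real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hot-pluggable device; not a kernel structure. */
struct fake_dev {
    bool present;       /* whether the hardware is really there right now */
    bool disconnected;  /* software flag, set only after removal is noticed */
    uint32_t reg;       /* contents of a single register */
};

/* Simulated MMIO read: a removed device reads back as all ones. */
static uint32_t fake_readl(const struct fake_dev *dev)
{
    return dev->present ? dev->reg : UINT32_MAX;
}

/*
 * Flag-first approach: skip the access if software already knows the
 * device is gone. The flag can be stale, so a "true" return does not
 * prove the read reached live hardware.
 */
static bool read_if_connected(const struct fake_dev *dev, uint32_t *val)
{
    if (dev->disconnected)
        return false;
    /* The device may disappear here, between the check and the read. */
    *val = fake_readl(dev);
    return true;
}

/*
 * Value-check approach: always read, then treat all ones as a strong
 * hint that the device is gone. A register that legitimately holds all
 * ones would be misreported, so this is a hint, not proof.
 */
static bool read_checked(const struct fake_dev *dev, uint32_t *val)
{
    *val = fake_readl(dev);
    return *val != UINT32_MAX;
}
```

In this model, a device that has been pulled but not yet flagged still passes
read_if_connected() and returns all ones, which is the race Greg describes.
read_checked() catches that case, but it always performs the access, which is
exactly what has to be avoided on platforms where firmware reacts badly to a
read of a missing device.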