Thread: 30 messages, 7 authors, 2017-08-31

Possible regression between 4.9 and 4.13

From: Mason <hidden>
Date: 2017-08-30 08:55:37
Also in: linux-pci

On 30/08/2017 08:02, Greg Kroah-Hartman wrote:
> To get back to the original issue here, the hardware seems to have died,
> the driver stops talking to it, and all is good.  The "regression" here
> is that we now properly can determine that the hardware is crap.

Before 4.12, when I unplugged my USB3 Flash drive, Linux would
detect a few "Uncorrected Non-Fatal errors" via AER, but it was
still possible to plug the drive back in.

Since 4.12, once I unplug the drive, the whole USB3 card is marked
as dead (all 4 ports), and I can no longer plug anything in (not even
the USB2 drive that didn't have any issues, IIRC).

It seems a bit premature to "mark as dead" something that remains
functional, doesn't it?

Disclaimer, there are many variables in this setup, and I've only
tested a small fraction of the problem space: only one system,
only one USB3 board, only one USB3 Flash drive.

> So, how do you think we should proceed, delay a bit longer before saying
> the device is gone?  How long is "long enough"?  How many bus errors are
> we allowed to tolerate (hint, the PCI spec says none...)
>
> Maybe someone wants to get to the root problem here, why is the hardware
> suddenly reporting all 1s?

I'm afraid I won't be able to make any progress on this front,
unless I can get my hands on a PCIe packet analyzer.

Regards.