Possible regression between 4.9 and 4.13
From: gregkh@linuxfoundation.org (Greg Kroah-Hartman)
Date: 2017-08-30 09:22:01
Also in: linux-pci
On Wed, Aug 30, 2017 at 10:07:59AM +0100, Ard Biesheuvel wrote:
> On 30 August 2017 at 09:55, Mason [off-list ref] wrote:
>> On 30/08/2017 08:02, Greg Kroah-Hartman wrote:
>>> To get back to the original issue here, the hardware seems to have died, the driver stops talking to it, and all is good. The "regression" here is that we now properly can determine that the hardware is crap.
>>
>> Before 4.12, when I unplugged my USB3 Flash drive, Linux would detect a few "Uncorrected Non-Fatal errors" via AER, but it was still possible to plug the drive back in.
>>
>> Since 4.12, once I unplug the drive, the whole USB3 card is marked as dead (all 4 ports), and I can no longer plug anything in (not even the USB2 drive that didn't have any issues, IIRC).
>>
>> It seems a bit premature to "mark as dead" something that remains functional, doesn't it?
>>
>> Disclaimer, there are many variables in this setup, and I've only tested a small fraction of the problem space: only one system, only one USB3 board, only one USB3 Flash drive.
>
> Please don't forget to mention that this is quirky hardware that depends on BROKEN because it multiplexes MMIO and config space accesses in the same memory window without any locking whatsoever (which would be difficult to do in the first place because we don't use accessors for MMIO in the kernel).
>
> So how likely is it that you are attempting to read from the xhci BAR window while a config space access is in progress? Any way to instrument this in your driver?
Seriously?

Ok, that's crap hardware, sorry, I don't feel bad at all here. You are going to have worse problems than just a single USB controller issue if that's your hardware design, go kick some hardware engineers for me please.

good luck, you are on your own :(

greg k-h