Thread: 11 messages, 3 authors, 2011-08-04

Re: ahci_start_engine compliance with AHCI spec

From: Tejun Heo <tj@kernel.org>
Date: 2011-07-22 09:03:24
Also in: lkml

Hello, Brian.

On Thu, Jul 21, 2011 at 10:13:16AM -0700, Brian Norris wrote:
> On Thu, Jul 21, 2011 at 1:49 AM, Tejun Heo [off-list ref] wrote:
> > On Mon, Jul 18, 2011 at 11:40:17AM -0700, Brian Norris wrote:
> > > On Wed, Jul 13, 2011 at 6:14 AM, Tejun Heo [off-list ref] wrote:
> > > > Hmmm... what happens if you don't comment out ahci_start_engine() call
> > > > from ahci_start_port()?
> > >
> > > I wasn't commenting out the ahci_start_engine() from
> > > ahci_start_port(). Can you clarify what you mean?
> >
> > Oh, I meant "what if you comment out..."  I wrote that sentence in
> > negative and then switched but forgot removing "don't".
>
> OK, well I tried simply commenting out that ahci_start_engine() on
> both my special controller and on the Dell E6410 laptop and it worked
> just fine (solved my issues and didn't cause any issues on the Dell).
> Is this safe? It seems like we end up calling ahci_start_engine() at
> the end of the error handling process anyway, so maybe this call is
> not really necessary in the first place?

Yes, I believe so.

> Anyway, I also tried my own fix for this: adding a small delay to wait
> for some link recognition at the end of ahci_power_up(). I'm not sure
> if this is the greatest, but it also works for both systems I'm
> testing. I included the test patch here (based on linux-2.6). BTW, I'm
> not sure my mail will be formatted perfectly here. I can resend with
> my other mailer if needed.

The problem is that neither my approach nor yours is ultimately safe
on this particular IP block.  I don't think it's possible to make
things completely safe for it.  There's no mutual exclusion between
PHY events - be it flaky signal, power surge or actual hotplug - and
driver operation.  No matter how carefully the driver behaves, if a
PHY event happens after the last check before starting the DMA engine,
DRQ may be set by the time the driver gets to it.
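
To make the window concrete, here is a minimal userspace sketch of a
check-then-start sequence.  It is not driver code: struct mock_port,
guarded_start_engine() and the phy_event_sets_drq() hook are
hypothetical names invented for illustration.  Only the bit positions
(BSY and DRQ in the status byte of PxTFD, ST in PxCMD) follow the AHCI
register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the AHCI register layout (PxTFD.STS and PxCMD). */
#define TFD_STS_DRQ  (1u << 3)
#define TFD_STS_BSY  (1u << 7)
#define PORT_CMD_ST  (1u << 0)

/* Hypothetical stand-in for a port's memory-mapped registers. */
struct mock_port {
    uint32_t tfd;   /* PxTFD: task file data, status in the low byte */
    uint32_t cmd;   /* PxCMD: command and status */
};

/* Hypothetical hook that models hardware changing state mid-sequence. */
typedef void (*phy_event_fn)(struct mock_port *);

static void phy_event_sets_drq(struct mock_port *p)
{
    p->tfd |= TFD_STS_DRQ;
}

/*
 * Check that the device looks idle, then set ST.  The hook runs
 * between the check and the write, standing in for a PHY event that
 * the driver has no way to exclude.  Returns true if ST was written.
 */
static bool guarded_start_engine(struct mock_port *p, phy_event_fn between)
{
    if (p->tfd & (TFD_STS_BSY | TFD_STS_DRQ))
        return false;            /* not idle: do not start */

    if (between)
        between(p);              /* state changes after the check */

    p->cmd |= PORT_CMD_ST;       /* engine is started anyway */
    return true;
}
```

Calling guarded_start_engine() on an idle port with
phy_event_sets_drq as the hook returns true with DRQ already set,
which is the case the check was supposed to prevent.  Adding more
checks or delays only narrows the window.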

The IP block you're dealing with is inherently buggy.  What the spec
means, I think, is that the DMA engine might not start or behave
properly if enabled while DRQ is set, which is fine.  The driver will
notice that, reset things and retry.  It is *completely* different
from "the controller becomes a brick until power cycled if that
happens".  So, we can work around it all we want but that is one buggy
controller.  If possible, please tell the manufacturer or licensor to
fix it.
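
The "notice, reset and retry" behaviour described above can be
sketched the same way.  This models a well-behaved controller under
stated assumptions: mock_start_engine(), mock_reset() and
start_with_recovery() are hypothetical helpers, and the rule that the
engine simply fails to run when started with DRQ set is an assumption
about how a recoverable controller would act, not something read from
real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define TFD_STS_DRQ  (1u << 3)
#define PORT_CMD_ST  (1u << 0)

/* Hypothetical model of a recoverable port, not a real device. */
struct mock_port {
    uint32_t tfd;      /* PxTFD: task file data */
    uint32_t cmd;      /* PxCMD: command and status */
    unsigned resets;   /* resets performed so far, for inspection */
};

/* Assumed behaviour: the engine does not run if started with DRQ set. */
static bool mock_start_engine(struct mock_port *p)
{
    p->cmd |= PORT_CMD_ST;
    return !(p->tfd & TFD_STS_DRQ);
}

/* Assumed behaviour: a reset stops the engine and clears stale DRQ. */
static void mock_reset(struct mock_port *p)
{
    p->cmd &= ~PORT_CMD_ST;
    p->tfd &= ~TFD_STS_DRQ;
    p->resets++;
}

/* Start the engine; on failure, reset the port and try again. */
static bool start_with_recovery(struct mock_port *p, unsigned max_tries)
{
    for (unsigned i = 0; i < max_tries; i++) {
        if (mock_start_engine(p))
            return true;
        mock_reset(p);
    }
    return false;
}
```

On a controller like this, a bad start costs one reset.  On the IP
block under discussion there is no such recovery path, which is why no
driver-side ordering can make it fully safe.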

For now, let's first try removing the ahci_start_engine() call from
port_start and see how that goes.

Thanks.

-- 
tejun