Thread: 10 messages, 3 authors, 2007-01-18

Re: watchdog timeout panic in e1000 driver

From: Kenzo Iwami <hidden>
Date: 2007-01-16 08:42:55

Hi,

Thank you for your comment.
> Thanks for staying patient while most of us were out or busy. Apart from acknowledging
> that you might have fixed a problem with your patch, we're very reluctant to merge such
> a huge change in our driver that touches many more cases than the one that seems to be
> giving you problems.
>
> I've thought up a much more elegant solution that prevents the driver from asserting the
> swfw semaphore during normal operations by checking the mac LU (link up) register in the
> watchdog. This allows the watchdog task to bypass all PHY checking in case all link
> statuses are OK, and thus removes the big problem that you are seeing.
>
> Attached is a version that should apply against most current trees. Please give it a try
> and let us know if this also fixes the problem for you. I will most likely push this
> patch to the netdev tree in any case.

I tried your patch. Unfortunately, the system still panicked with the
same symptom.

In your patch, e1000_update_stats() is still called by e1000_watchdog(),
and e1000_update_stats() calls e1000_read_phy_reg().
Therefore, the interrupt handler still tries to acquire the semaphore,
and the same problem occurs.

To fix this problem, the interrupt handler must not call e1000_read_phy_reg()
while the interrupted code is holding the semaphore.

My patch may seem like a huge change, but in essence the change is
pretty simple.

In my patch, the interrupt handler checks whether the interrupted
code is holding the swfw semaphore. If it is held, the watchdog function
is deferred until the swfw semaphore is released.
The modification touches only the interrupted code that holds the
semaphore and the interrupt handler, so both changes are directly related
to this problem.
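
To illustrate the idea, here is a minimal userspace sketch of the deferral
logic. It is not the actual patch: every name in it (model_adapter,
model_irq_watchdog, and so on) is hypothetical, the semaphore is modeled as a
plain flag, and the locking or interrupt masking a real driver would need
around these checks is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the adapter state; not the real struct e1000_adapter. */
struct model_adapter {
    bool swfw_held;        /* true while the interrupted code owns the semaphore */
    bool watchdog_pending; /* watchdog was requested while the semaphore was held */
    int  watchdog_runs;    /* number of times the watchdog body actually ran */
};

/* Stand-in for the watchdog body, which ends up reading PHY registers. */
void model_watchdog(struct model_adapter *a)
{
    /* A PHY read needs the semaphore, so it must be free at this point. */
    assert(!a->swfw_held);
    a->watchdog_runs++;
}

/* Entry point used from the (simulated) interrupt handler. */
void model_irq_watchdog(struct model_adapter *a)
{
    if (a->swfw_held) {
        /* The interrupted code owns the semaphore: defer instead of spinning. */
        a->watchdog_pending = true;
        return;
    }
    model_watchdog(a);
}

/* The interrupted code takes the semaphore before touching the PHY. */
void model_swfw_acquire(struct model_adapter *a)
{
    assert(!a->swfw_held);
    a->swfw_held = true;
}

/* On release, run the watchdog if one was deferred in the meantime. */
void model_swfw_release(struct model_adapter *a)
{
    assert(a->swfw_held);
    a->swfw_held = false;
    if (a->watchdog_pending) {
        a->watchdog_pending = false;
        model_watchdog(a);
    }
}
```

In this model, a watchdog request that arrives between model_swfw_acquire()
and model_swfw_release() only sets the pending flag, and the watchdog body
runs once from the release path. A request that arrives while the semaphore
is free runs immediately.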

I will try to add some comments to my code to make it more readable.
--
  Kenzo Iwami (k-iwami@cj.jp.nec.com)