Thread: 15 messages, 6 authors, 2005-10-17

Re: SATA150TX4 atat1:command timeout

From: Francois Payette <hidden>
Date: 2005-02-16 15:03:37

With plain vanilla 2.6.11-rc4 the same bug appears after about 250GB 
(avg of 2 trials). With the TBG clock setting line omitted it still 
happens, but after about 1.1 TB (avg of 2 trials, takes about 6hrs per 
trial). Interestingly enough, this change does not slow down the setup, 
it even seems a little faster.

I was mistaken earlier: the 4 drives are not exactly the same; there are 
two 6B200M0, one 6B200S0 and one 6Y200M0. This should be irrelevant as I 
have swapped disks and wires and the problem happens anyway. One 
interesting thing: in init 1 the timeout seems to appear faster, after 
about 200GB with the clock setting line omitted. I would be inclined to 
think this is some sort of deadlock or race condition: the kernel does 
not dump or panic, it just hangs on pdc_eng_timeout. When we dumped the 
stack in that function, all we had was pdc_eng_timeout, as there seems 
to be a separate thread per disk that gets woken up for error handling.

Any ideas on how we can catch this one?
TIA,
Francois