Thread: 8 messages, 3 authors, 2004-08-30

Re: Libata VIA woes continue. Worked around - *wrong*

From: Jeff Garzik <hidden>
Date: 2004-08-29 09:04:49

Brad Campbell wrote:
Jeff Garzik wrote:
Well, there are some cases on a few controllers (SiI is one that comes 
to mind) where -- IIRC -- bridges dictate the max is UDMA/100, not 
UDMA/133, even if the underlying device is UDMA/133.

In sata_promise.c or sata_via.c, what happens if you change udma_mask 
from 0x7f to 0x3f?  Do the failures go away?

These drives are UDMA/100. On the VIA controller I changed the udma_mask 
to 0x1f and the failures "appeared" to go away but that was before I 
realised the exact nature of the failure mode. (That being it will 
either fail on bootup, or very soon after or it will work perfectly 
until the next boot)
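
For reference, the three masks mentioned above can be decoded with a short sketch. It assumes that bit n of udma_mask enables UDMA mode n, and it uses the nominal mode names (UDMA/33, /66, /100, /133); the helper name max_udma is made up for illustration and is not part of libata.

```python
# Nominal names for UDMA modes 0..6 (assumption: bit n of the mask = mode n).
UDMA_RATES = [16, 25, 33, 44, 66, 100, 133]


def max_udma(udma_mask):
    """Return (mode, nominal rate) for the highest mode enabled in the mask.

    Hypothetical helper for illustration only; returns None for an empty mask.
    """
    masked = udma_mask & 0x7F  # only modes 0..6 are considered here
    if masked == 0:
        return None
    mode = masked.bit_length() - 1  # index of the highest set bit
    return mode, UDMA_RATES[mode]


for mask in (0x7F, 0x3F, 0x1F):
    mode, rate = max_udma(mask)
    print(f"udma_mask 0x{mask:02x}: up to UDMA{mode} (UDMA/{rate})")
```

Under that assumption 0x7f allows up to UDMA/133, 0x3f caps at UDMA/100, and 0x1f caps at UDMA/66, which matches the UDMA/66 results described further down.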

I can always hook the drives up and hammer them if you'd like me to do 
further testing but I'm not sure how we can then let libata know that 
the drives connected need to be slowed down as we can't identify we have 
a bridge connected really.

I'm still not convinced that it's not something else.
Sure, transfers > 200 sectors killed it on the VIA controller at UDMA/100 
while they appeared to work ok at UDMA/66. I guess I need to run a 
defined array of tests.

- Large transfers (> 200) at UDMA/100 and UDMA/66
- Small transfers (<=200) at UDMA/100 and UDMA/66
- Something like 10 reboot cycles of each.
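
The test array above could be enumerated roughly like this, so that every combination gets the same number of boots. This is only a sketch: the labels and the count of 10 come from the list, while the function name and the tuple layout are assumptions.

```python
from itertools import product

MODES = ("UDMA/100", "UDMA/66")
TRANSFERS = ("large (> 200 sectors)", "small (<= 200 sectors)")
REBOOTS_PER_CASE = 10  # "something like 10" in the list above


def test_matrix():
    """Return one (mode, transfer size, boot number) tuple per planned boot."""
    return [
        (mode, size, boot)
        for mode, size in product(MODES, TRANSFERS)
        for boot in range(1, REBOOTS_PER_CASE + 1)
    ]


runs = test_matrix()
print(f"{len(runs)} boots in total")
for mode, size, boot in runs[:3]:
    print(mode, size, f"boot {boot}")
```

Four combinations at 10 boots each comes to 40 reboots per controller.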

It's very hard to hit on the Promise controller (Perhaps < 10% of 
reboots) while on the VIA controller it happens maybe 60% of the time.
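
Those failure rates matter for how many reboots a "clean" result needs. A rough calculation, assuming each boot fails independently with the quoted probability (an assumption, since the real failure mode may not be independent from boot to boot):

```python
import math


def p_all_clean(fail_rate, boots):
    """Probability that `boots` consecutive boots all pass even though the bug is present."""
    return (1.0 - fail_rate) ** boots


def boots_needed(fail_rate, confidence=0.95):
    """Smallest number of clean boots needed to rule the bug out at `confidence`."""
    if not 0.0 < fail_rate < 1.0:
        raise ValueError("fail_rate must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fail_rate))


for name, rate in (("VIA", 0.60), ("Promise", 0.10)):
    print(f"{name}: P(10 clean boots with bug present) = {p_all_clean(rate, 10):.4f}, "
          f"boots for 95% confidence = {boots_needed(rate)}")
```

With these numbers, 10 clean boots on the VIA controller is strong evidence (about a 0.01% chance of missing the bug), while on the Promise controller 10 clean boots would still miss it roughly 35% of the time, and about 29 clean boots would be needed for 95% confidence.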

And of course 2.6.5 never hits it at all. (And given I patched the VIA 
driver in 2.6.9-rc1 to keep transfers < 200 sectors and still hit the 
bug it's not that!)

Well, if you are completely unable to reproduce in 2.6.5, there are a 
couple things to try:

* copy drivers/scsi/libata*, drivers/scsi/sata_*, 
drivers/scsi/ata_piix.c, include/linux/libata.h, include/linux/ata.h 
from 2.6.9-rc1-bk into 2.6.5, and see if you can reproduce the failure. 
(I can help if there are any compile/API problems you can't figure 
out.)  That will eliminate non-libata changes at least.

* look at the changes from 2.6.5 -> 2.6.6 and see which change breaks 
things.  You can get a list of each change like this:

	bk changes -rv2.6.5..v2.6.6

then you can revert each patch in order, or bsearch.  Here's an example 
of reverting each libata patch in order:

bk clone http://linux.bkbits.net/linux-2.5 vanilla-2.6
bk clone -ql -rv2.6.6 vanilla-2.6 brad-test-2.6.6
cd brad-test-2.6.6
bk -r co -Sq
bk changes -rv2.6.5.. > /tmp/changes-list.txt
less /tmp/changes-list.txt	# scan for a libata-related change
bk cset -x1.1587.39.2		# applies reverse of cset 1.1587.39.2
make				# create test
				# ... test fails
bk cset -x1.1587.39.1		# applies reverse of cset 1.1587.39.1
				# _on top of_ previous reverted patch
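
The "bsearch" alternative mentioned above can be sketched as follows. This is a generic binary search, not a BitKeeper feature: it assumes the changes can be put in a linear order, that the tree with none of them applied (2.6.5) is good, and that the tree with all of them applied fails. The is_bad callback is hypothetical; in practice it would mean building a tree with the first k changes and rebooting enough times to trust the result. The change names are placeholders, not real cset ids.

```python
def first_bad(csets, is_bad):
    """Return the change that introduces the failure.

    csets  -- changes in the order they were applied (oldest first)
    is_bad -- hypothetical callback: is_bad(k) is True if the tree with the
              first k changes applied shows the failure
    Assumes is_bad(0) is False and is_bad(len(csets)) is True.
    """
    lo, hi = 0, len(csets)  # lo: known good count, hi: known bad count
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return csets[hi - 1]


# Toy run with placeholder names; the failure appears with the 5th change.
changes = [f"change-{i}" for i in range(1, 9)]
print(first_bad(changes, lambda k: k >= 5))
```

For a list of n changes this needs about log2(n) build-and-test rounds instead of up to n when reverting one patch at a time.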