Re: Problem with ata layer in 2.6.24
From: Gene Heskett <hidden>
Date: 2008-01-28 17:00:47
On Monday 28 January 2008, Gene Heskett wrote: While reading this msg as it came back, I locked up again and rebooted to 2.6.24, and got lucky (maybe) as the attached dmesg will show quite a few instances of this LOOOONNNGG before the nvidia driver is loaded to taint the kernel. Have fun guys!
On Monday 28 January 2008, Mikael Pettersson wrote:
> Gene Heskett writes:
> > On Monday 28 January 2008, Peter Zijlstra wrote:
> > > On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
> > > > 1. Wrong mailing list; use linux-ide (@vger) instead.
> > >
> > > What, and keep all us other interested people in the dark?
> >
> > As a test, I tried rebooting to the latest fedora kernel and found it kills X, so I'm back to the second to last fedora version ATM, and the third 'smartctl -t long /dev/sda' in 24 hours is running now. The first two completed with no errors.
> >
> > I've added the linux-ide list to refresh those people on the problem; the logs are being spammed by this message stanza:
> >
> > Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
> > Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> > Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
> > Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
> > Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
> > Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
> > Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
> > Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
> > Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> It's not obvious from this incomplete dmesg log what HW or driver is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one, it should be pata_amd driving a WDC disk:
> > [ 30.702887] pata_amd 0000:00:09.0: version 0.3.10
> > [ 30.703052] PCI: Setting latency timer of device 0000:00:09.0 to 64
> > [ 30.703188] scsi0 : pata_amd
> > [ 30.709313] scsi1 : pata_amd
> > [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
> > [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
> > [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
> > [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
> > [ 30.871629] ata1.00: configured for UDMA/100
>
> Unfortunately we also see:
> > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
> > [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
> > [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
>
> We have no way of debugging that module, so please try 2.6.24 without it.

Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19" crt at 1600x1200, and will not drive this 20" wide screen lcd 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day to day basis without the nvidia driver. Fix the nv driver so it will run this screen at its native resolution and I'll be glad to run it even if it won't run google earth, which I do use from time to time.

Now, if in all the hits you can get from google on this, currently 14,800 just for 'exception Emask', apparently caused by a timeout, 100% of the complainers are running nvidia drivers also, then I see a legit complaint. Again, fix the nv driver so it will run my screen & I'll be glad to switch. I can see the reason, sure, but the machine must be capable of doing its common day to day stuff while using that driver, like running kde for kmail, and browsers that work.
> If the problems persist, please try to capture a complete log from the failing kernel -- the interesting bits are everything from initial boot up to and including the first few errors. You may need to increase the kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT).

If by log you mean /var/log/messages, I have several megabytes of those. If you mean a live dmesg capture taken right now, it's attached. It contains several of these at the bottom. I long ago made the kernel log buffer bigger, cuz it couldn't even show the start immediately after the boot, and even the dump to syslog was truncated.
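For anyone following the CONFIG_LOG_BUF_SHIFT suggestion: the option is a power-of-two exponent, so the buffer holds 1 << shift bytes. A minimal sketch of that arithmetic (the helper name `log_buf_bytes` is made up for illustration, and the shift values in the loop are arbitrary examples, not the ones used on this machine):

```python
def log_buf_bytes(shift: int) -> int:
    """Kernel log buffer size in bytes for a given CONFIG_LOG_BUF_SHIFT."""
    if shift < 0:
        raise ValueError("shift must be non-negative")
    return 1 << shift


if __name__ == "__main__":
    # Example shift values only; pick whatever is large enough to hold a
    # full boot log plus the first few errors.
    for shift in (14, 16, 17, 18, 21):
        print(f"CONFIG_LOG_BUF_SHIFT={shift}: {log_buf_bytes(shift) // 1024} KiB")
```

Each increment of the shift doubles the buffer, so going from 17 to 18 takes it from 128 KiB to 256 KiB.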
> There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.

That is what I was afraid of. I've done some limited grepping in that branch of the kernel tree, and cannot seem to locate where this EH handler is being invoked from.

There are 2 lines of interest in the dmesg:

[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
[ 0.000000] If you got timer trouble try acpi_use_timer_override

But I have NDI what it means, kernel argument/xconfig option?

I've also done some googling, and it appears this problem is fairly widespread since the switchover to libata was encouraged. A stock fedora F8 kernel suffers the same freezes and eventually locks up, but does it without the error messages being logged; it just freezes, feeling identical to this in the minutes before the total freeze. I've tried 2 of those too, but the newest one won't even run X.
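The claim at the top of this message, that the exceptions show up before the nvidia module taints the kernel, can be checked mechanically against a saved dmesg. The following is only a sketch under stated assumptions: the function name, both regular expressions, and the sample list are illustrative, and the first sample timestamp is invented since the attached dmesg is not reproduced here.

```python
import re

# Matches a libata exception header such as
# "[26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"
EXCEPTION_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>ata\d+(?:\.\d+)?): exception Emask"
)
# Matches the taint notice printed when the nvidia module loads.
TAINT_RE = re.compile(r"\[\s*(?P<ts>\d+\.\d+)\]\s+nvidia: module license")


def exceptions_before_taint(lines):
    """Return (count_before_taint, total_count) of libata exception lines.

    If no taint line is present, every exception counts as "before".
    Timestamps are only comparable within a single boot, so feed this
    one dmesg capture at a time.
    """
    taint_ts = None
    exception_ts = []
    for line in lines:
        m = EXCEPTION_RE.search(line)
        if m:
            exception_ts.append(float(m.group("ts")))
            continue
        t = TAINT_RE.search(line)
        if t and taint_ts is None:
            taint_ts = float(t.group("ts"))
    if taint_ts is None:
        return len(exception_ts), len(exception_ts)
    before = sum(1 for ts in exception_ts if ts < taint_ts)
    return before, len(exception_ts)


# Hypothetical sample: the first line's timestamp is invented for
# illustration; the other two lines are taken from this message.
SAMPLE = [
    "[ 35.000000] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
    "[ 48.285456] nvidia: module license 'NVIDIA' taints kernel.",
    "[26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
]

if __name__ == "__main__":
    before, total = exceptions_before_taint(SAMPLE)
    print(f"{before} of {total} exceptions logged before the nvidia taint line")
```

To run it on a real capture, pass the lines of a saved `dmesg` output instead of `SAMPLE`. A nonzero first number would mean at least one timeout was logged before the module loaded.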
-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Deprive a mirror of its silver and even the Czar won't see his face.