Re: BUG: soft lockup detected on CPU#0! (2.6.18.2 plus hacks)
From: Ben Greear <hidden>
Date: 2007-01-04 01:03:17
Herbert Xu wrote:
David Stevens [off-list ref] wrote:
Ben, Here's a patch that I think will fix it, assuming the receive is on the same device as the initialization. Can you try this out?

Hi David:

Your patch makes sense on its own but I don't see the direct connection to the soft lock-up. Sure it prevents the code path in question from triggering. However, if we don't understand why it's locking up in the first place then this may just be hiding it rather than fixing it.

In particular, a soft lockup means that we're doing so much work in the softirq handlers that processes are not getting run. So what is it exactly here that's causing us to get stuck in the softirq handlers? Is it because we're somehow getting stuck in a net rx loop?
I'm not sure if it helps, but I did notice that 'ip' was using 99% of the CPU on the system. Could this be because it was spinning trying to acquire the read-lock?

When I ran 'ifconfig -a', that process hung, and at that point the system was rebooted. Before I ran ifconfig, 'top' and 'ls' and similar apps were responding fine, and I was logged in over ssh from the US to Australia, so its basic networking was functioning.

What if the race is that the read-lock is only half initialized, so that it doesn't trigger the uninitialized-lock-use debug message, but still screws up and will not ever let the reader acquire the lock?

Thanks,
Ben
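To make the half-initialized-lock guess concrete, here is a small userspace toy model in C. It is not the kernel's rwlock code, and every name in it (toy_rwlock, TOY_RW_BIAS, TOY_MAGIC and the helpers) is made up for illustration. It only assumes a bias-counter design where an unlocked lock holds a nonzero bias, so a counter left at zero looks as if a writer holds the lock. Whether anything like this actually happens in 2.6.18.2 is unverified.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration only, NOT the kernel's rwlock_t. */
#define TOY_RW_BIAS 0x01000000   /* counter value of a free lock (assumed design) */
#define TOY_MAGIC   0x5eadbeefu  /* made-up marker a debug check would inspect */

struct toy_rwlock {
    atomic_int   count;  /* TOY_RW_BIAS = free, 0 = looks write-locked */
    unsigned int magic;  /* set once the lock claims to be initialized */
};

/* Full initialization: magic and counter are both set. */
static void toy_rwlock_init(struct toy_rwlock *l)
{
    l->magic = TOY_MAGIC;
    atomic_init(&l->count, TOY_RW_BIAS);
}

/* Models the guessed race: magic is already set, counter is still zero. */
static void toy_rwlock_half_init(struct toy_rwlock *l)
{
    l->magic = TOY_MAGIC;
    atomic_init(&l->count, 0);
}

/* Stand-in for an "uninitialized lock" debug check: it only looks at magic. */
static bool toy_debug_check(const struct toy_rwlock *l)
{
    return l->magic == TOY_MAGIC;
}

/* One read-lock attempt: take a reader slot, back out if the result is negative. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
    if (atomic_fetch_sub(&l->count, 1) - 1 >= 0)
        return true;
    atomic_fetch_add(&l->count, 1);  /* undo the failed attempt */
    return false;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
    atomic_fetch_add(&l->count, 1);
}

/* Bounded stand-in for a spinning read lock: returns tries used, or -1. */
static int toy_read_lock_bounded(struct toy_rwlock *l, int max_spins)
{
    for (int i = 1; i <= max_spins; i++)
        if (toy_read_trylock(l))
            return i;
    return -1;  /* a real spinning reader would still be burning CPU here */
}
```

In this model a lock set up with toy_rwlock_half_init() passes toy_debug_check(), yet toy_read_lock_bounded() on it never succeeds no matter how many spins are allowed, while a lock set up with toy_rwlock_init() is acquired on the first try. That would at least be consistent with a reader pinned at 99% CPU and no debug message.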
Cheers,
-- 
Ben Greear [off-list ref]
Candela Technologies Inc  http://www.candelatech.com