Re: BUG in 2.6.20-rt8
From: Paul E. McKenney <hidden>
Date: 2007-02-26 01:52:24
Also in:
lkml
On Sat, Feb 24, 2007 at 10:37:44PM -0800, David Miller wrote:
From: Ingo Molnar <redacted> Date: Sun, 25 Feb 2007 07:27:47 +0100quoted
* Paul E. McKenney [off-list ref] wrote:quoted
I got the following running stock 2.6.20-rt8 on an 4-CPU 1.8GHz Opteron box. The machine continued to run a few rounds of kernbench and LTP. Looks a bit scary -- a tasklet was "stolen" from __tasklet_action(). Thoughts? In the meantime, kicking it off again to see if it repeats.quoted
BUG: at kernel/softirq.c:559 __tasklet_action()this seems to happen very sporadically. Seems to happen more likely on hyperthreading CPUs. It is very likely caused by the redesign-tasklet-locking-to-be-sane patch below - which is a quick hack of mine from early -rt days. Can you see any obvious bug in it? The cmpxchg logic is certainly a bit ... tricky, locking-wise.Ingo, please don't use cmpxchg() in generic code, we support several processors that simply cannot do it.
OK, I will bite... Why doesn't the traditional hash table of locks work here? Use the cache-line address as input to the hash function, take the corresponding lock, do the compare-and-exchange by hand, and then release the lock. What am I missing here? Address aliasing do to memory being mapped into multiple locations or something? (In that case, use only the portion of the address within the page, right?) I will agree that cmpxchg() has been abused pretty thoroughly in some venues, but it does have legitimate uses. Thanx, Paul
Instead of saying "it's just something special in -rt for now", take it out now so that what you do eventually push upstream does get tested.