Re: [PATCH] improved locking performance in rt_run_flush()
From: David Miller <davem@davemloft.net>
Date: 2007-05-31 23:26:23
Also in:
lkml
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Sun, 20 May 2007 15:11:48 +1000
David Miller [off-list ref] wrote:quoted
From: Dave Johnson <redacted>quoted
The below patch changes rt_run_flush() to only take each spinlock protecting the rt_hash_table once instead of taking a spinlock for every hash table bucket (and ending up taking the same small set of locks over and over)....quoted
I'm not ignoring it I'm just trying to brainstorm whether there is a better way to resolve this inefficiency. :-)The main problem I see with this is having to walk and free each chain with the lock held. We could avoid this if we had a pointer in struct rtable to chain them up for freeing later. I just checked and struct rtable is 236 bytes long on 32-bit but the slab cache pads it to 256 bytes so we've got some free space. I suspect 64-bit should be similar.
SLUB I believe packs more aggressively and won't pad things out like
that. Therefore adding a member to rtable is much less attractive.
I've been considering various alternative ways to deal with this.
For 2.6.22 and -stable's sake we could allocate an array of pointers
of size N where N is the number of rtable hash slots per spinlock.
A big lock wraps around rt_run_flush() to protect these slots, and
then the loop is:
grap_lock();
for_each_hash_chain_for_lock(i) {
rth = rt_hash_table[i].chain;
if (rth) {
rt_hash_table[i].chain = NULL;
flush_chain[i % N] = rt;
}
}
drop_lock();
for (i = 0; i < N; i++) {
struct rtable *rth = flush_chain[i];
flush_chain[i] = NULL;
while (rth) {
struct rtable *next = rth->u.dst.rt_next;
rt_free(rth);
rth = next;
}
}
Holding a lock across the entire hash plucking has it's not nice
properties, but it's better than taking the same lock N times in
a row.
In the longer term, if I resurrect my dynamically sized rtable hash
patches (which I do intend to do), that code protected a lot of this
stuff with a seqlock and it might be possible to use that seqlock
solely to flush the lists in rt_run_flush().
Any better ideas?