Thread (4 messages) 4 messages, 3 authors, 2007-05-31

Re: [PATCH] improved locking performance in rt_run_flush()

From: David Miller <davem@davemloft.net>
Date: 2007-05-31 23:26:23
Also in: lkml

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun, 20 May 2007 15:11:48 +1000
David Miller [off-list ref] wrote:
quoted
From: Dave Johnson <redacted>
quoted
The below patch changes rt_run_flush() to only take each spinlock
protecting the rt_hash_table once instead of taking a spinlock for
every hash table bucket (and ending up taking the same small set 
of locks over and over).
...
quoted
I'm not ignoring it I'm just trying to brainstorm whether there
is a better way to resolve this inefficiency. :-)
The main problem I see with this is having to walk and free each
chain with the lock held.  We could avoid this if we had a pointer
in struct rtable to chain them up for freeing later.

I just checked and struct rtable is 236 bytes long on 32-bit but
the slab cache pads it to 256 bytes so we've got some free space.
I suspect 64-bit should be similar.
SLUB I believe packs more aggressively and won't pad things out like
that.  Therefore adding a member to rtable is much less attractive.

I've been considering various alternative ways to deal with this.

For 2.6.22 and -stable's sake we could allocate an array of pointers
of size N where N is the number of rtable hash slots per spinlock.
A big lock wraps around rt_run_flush() to protect these slots, and
then the loop is:

	grap_lock();
	for_each_hash_chain_for_lock(i) {
		rth = rt_hash_table[i].chain;
		if (rth) {
			rt_hash_table[i].chain = NULL;
			flush_chain[i % N] = rt;
		}
	}
	drop_lock();

	for (i = 0; i < N; i++) {
		struct rtable *rth = flush_chain[i];
		flush_chain[i] = NULL;
		while (rth) {
			struct rtable *next = rth->u.dst.rt_next;
			rt_free(rth);
			rth = next;
		}
	}

Holding a lock across the entire hash plucking has it's not nice
properties, but it's better than taking the same lock N times in
a row.

In the longer term, if I resurrect my dynamically sized rtable hash
patches (which I do intend to do), that code protected a lot of this
stuff with a seqlock and it might be possible to use that seqlock
solely to flush the lists in rt_run_flush().

Any better ideas?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help