Thread: 215 messages, 22 authors, 2009-05-01

Re: [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3)

From: Eric Dumazet <hidden>
Date: 2009-04-16 06:28:19
Also in: lkml, netfilter-devel

David Miller wrote:
> From: Eric Dumazet <redacted>
> Date: Wed, 15 Apr 2009 23:07:29 +0200
>
>> Well, it seems original patch was not so bad after all
>>
>> http://lists.netfilter.org/pipermail/netfilter-devel/2006-January/023175.html
>>
>> So change per-cpu spinlocks to per-cpu rwlocks
>>
>> and use read_lock() in ipt_do_table() to allow recursion...
>
> Grumble, one more barrier to getting rid of rwlocks in the whole
> tree. :-/
>
> I really think we should entertain the idea where we don't RCU quiesce
> when adding rules.  That was dismissed as not workable because the new
> rule must be "visible" as soon as we return to userspace but let's get
> real, effectively it will be.

We had to RCU quiesce to be sure the old rules were no longer in use before
freeing them. The alternative is to defer the freeing via call_rcu(), but
that is subject to OOM.

With 200 basic rules, the size of the rules table is about 40960 bytes per CPU
(88 pages of vmalloc virtual address space on my 8-CPU machine):
0xfcaf8000-0xfcb03000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
0xfcb04000-0xfcb0f000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
0xfcb10000-0xfcb1b000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
0xfcb1c000-0xfcb27000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
0xfcb28000-0xfcb33000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
0xfcb34000-0xfcb3f000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
0xfcb40000-0xfcb4b000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
0xfcb4c000-0xfcb57000   45056 xt_alloc_table_info+0xa8/0xd0 pages=10 vmalloc
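
The figures above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes 4 KiB pages and that the extra page in each 45056-byte area is the vmalloc guard page (the listing reports pages=10 of backing memory per area). The helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
enum { PAGE_BYTES = 4096 };

/* Pages needed to back `bytes` of table data, rounded up. */
static size_t data_pages(size_t bytes)
{
    return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Virtual pages spanned by one vmalloc area:
 * the data pages plus one guard page (assumed). */
static size_t area_pages(size_t bytes)
{
    return data_pages(bytes) + 1;
}

/* Virtual pages used by all per-CPU copies of the table. */
static size_t total_virtual_pages(size_t bytes, unsigned int ncpus)
{
    return area_pages(bytes) * ncpus;
}

/* Reproduces the numbers quoted in the mail. */
static void check_mail_figures(void)
{
    assert(data_pages(40960) == 10);                 /* pages=10        */
    assert(area_pages(40960) * PAGE_BYTES == 45056); /* size per area   */
    assert(total_virtual_pages(40960, 8) == 88);     /* 8 CPUs, 88 pages */
}
```

So each CPU's copy is 10 data pages (40960 bytes), each area spans 11 pages of virtual space, and 8 CPUs give the 88 pages mentioned.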

This kind of huge monolithic object is hard to handle with RCU semantics,
which are better suited to sets of small objects (struct file, for example),
even if RCU can have a backlog of 10000 elements in its queue...
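
To put a rough scale on that, here is a back-of-envelope sketch. It reads the 10000 figure as the number of objects that may sit queued waiting for a grace period, and it uses a hypothetical 256-byte size for the "small object" case (that size is an assumption, not a measured sizeof). It is a worst-case estimate, not a measurement.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes pinned while `queued` replaced objects wait for a grace period,
 * each object having one copy of `bytes_per_cpu` on each of `ncpus` CPUs. */
static uint64_t pinned_bytes(uint64_t bytes_per_cpu, uint64_t ncpus,
                             uint64_t queued)
{
    return bytes_per_cpu * ncpus * queued;
}

static void compare_backlogs(void)
{
    /* Figures from the mail: 40960-byte table, 8 CPUs, 10000 queued. */
    uint64_t tables = pinned_bytes(40960, 8, 10000);

    /* Hypothetical small object: 256 bytes, a single copy (assumption). */
    uint64_t small = pinned_bytes(256, 1, 10000);

    assert(tables == 3276800000ULL); /* roughly 3 GiB   */
    assert(small == 2560000ULL);     /* roughly 2.4 MiB */
}
```

A full queue of small objects costs a few megabytes, while the same queue depth of rule tables would pin gigabytes, which is where the OOM worry comes from.
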
> If there are any stale object reference issues, we can use RCU object
> destruction to handle that kind of thing.
>
> I almost cringed when the per-spinlock idea was proposed, but per-cpu
> rwlocks just takes things too far for my tastes.

In my humble opinion, this is a reasonable compromise, and Stephen's patch
version 4 is OK for me.
