[PATCH] rcu: increment quiescent state counter in ksoftirqd()



From: Eric Dumazet <hidden>
Date: 2009-02-27 16:08:38
Also in: lkml, netfilter-devel
Subsystem: the rest · Maintainer: Linus Torvalds

Eric Dumazet wrote:

Eric Dumazet wrote:

Stephen Hemminger wrote:

The reader/writer lock in ip_tables is acquired in the critical path of
processing packets and is one of the reasons just loading iptables can cause
a 20% performance loss. The rwlock serves two functions:

1) it prevents changes to table state (xt_replace) while the table is in use.
   This is now handled by doing rcu on the xt_table. When a table is
   replaced, the new table(s) are put in and the old table(s) are freed
   after an RCU grace period.

2) it provides synchronization when accessing the counter values.
   This is now handled by swapping in new table_info entries for each cpu
   then summing the old values, and putting the result back onto one
   cpu.  On a busy system it may cause sampling to occur at different
   times on each cpu, but no packet/byte counts are lost in the process.
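
The counter handling described in (2) can be illustrated with a rough,
single-threaded userspace sketch. It is not the ip_tables code: the names
(table_copy, live, count_packet, read_counters) and the sizes are
hypothetical, and the RCU wait between the swap and the summing is only
marked by a comment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS  4   /* hypothetical CPU count */
#define NR_RULES 2   /* hypothetical rule count */

struct counter { uint64_t pcnt, bcnt; };          /* packets, bytes */
struct table_copy { struct counter c[NR_RULES]; };

/* Two copies per CPU; live[cpu] is the one the packet path updates. */
static struct table_copy copy_a[NR_CPUS], copy_b[NR_CPUS];
static struct table_copy *live[NR_CPUS];

static void counters_init(void)
{
	memset(copy_a, 0, sizeof(copy_a));
	memset(copy_b, 0, sizeof(copy_b));
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		live[cpu] = &copy_a[cpu];
}

/* Packet path: touches only this CPU's live copy, no lock taken. */
static void count_packet(int cpu, int rule, uint64_t bytes)
{
	assert(cpu >= 0 && cpu < NR_CPUS && rule >= 0 && rule < NR_RULES);
	live[cpu]->c[rule].pcnt++;
	live[cpu]->c[rule].bcnt += bytes;
}

/* Reader path: swap in zeroed copies, sum the old ones, fold back on CPU 0. */
static void read_counters(struct counter total[NR_RULES])
{
	struct table_copy *old[NR_CPUS];

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct table_copy *fresh =
			(live[cpu] == &copy_a[cpu]) ? &copy_b[cpu] : &copy_a[cpu];
		memset(fresh, 0, sizeof(*fresh));
		old[cpu] = live[cpu];
		live[cpu] = fresh;   /* the kernel would wait for an RCU period here */
	}

	memset(total, 0, NR_RULES * sizeof(total[0]));
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		for (int r = 0; r < NR_RULES; r++) {
			total[r].pcnt += old[cpu]->c[r].pcnt;
			total[r].bcnt += old[cpu]->c[r].bcnt;
		}

	/* Put the sums back onto one CPU so nothing is lost. */
	for (int r = 0; r < NR_RULES; r++) {
		live[0]->c[r].pcnt += total[r].pcnt;
		live[0]->c[r].bcnt += total[r].bcnt;
	}
}
```

Because the totals are added back onto CPU 0's live copy, a later read still
returns cumulative values even though every per-cpu copy was swapped out.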

Signed-off-by: Stephen Hemminger <redacted>

Acked-by: Eric Dumazet <redacted>

Successfully tested on my dual quad core machine too, but iptables only (no ipv6 here)

BTW, my new "tbench 8" result is 2450 MB/s (it was 2150 MB/s not so long ago)

Thanks Stephen, that's very cool stuff, yet another rwlock out of the kernel :)

While testing multicast flooding stuff, I found that "iptables -nvL" can 
have a *very* slow response time on my dual quad core machine...


# time iptables -nvL
Chain INPUT (policy ACCEPT 416M packets, 64G bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 401M packets, 62G bytes)
 pkts bytes target     prot opt in     out     source               destination

real    0m1.810s  <<<< HERE >>>>
user    0m0.000s
sys     0m0.001s


CONFIG_NO_HZ=y
CONFIG_HZ_1000=y
CONFIG_HZ=1000

One cpu is spending 100% of its time handling softirqs; could that be the problem?

Cpu0  :  1.0%us, 14.7%sy,  0.0%ni, 83.3%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
Cpu1  :  3.6%us, 23.2%sy,  0.0%ni, 71.6%id,  0.0%wa,  0.0%hi,  1.7%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,100.0%si,  0.0%st
Cpu3  :  2.7%us, 23.9%sy,  0.0%ni, 71.1%id,  0.7%wa,  0.0%hi,  1.7%si,  0.0%st
Cpu4  :  1.3%us, 14.3%sy,  0.0%ni, 83.3%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
Cpu5  :  1.0%us, 14.2%sy,  0.0%ni, 83.4%id,  0.0%wa,  0.0%hi,  1.3%si,  0.0%st
Cpu6  :  0.3%us,  7.0%sy,  0.0%ni, 92.4%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu7  :  0.7%us,  8.0%sy,  0.0%ni, 90.0%id,  0.7%wa,  0.0%hi,  0.7%si,  0.0%st

Hi Paul

I found that the following patch helps when one cpu is looping inside ksoftirqd().

synchronize_rcu() now completes in 40 ms instead of 1800 ms.

Thank you

[PATCH] rcu: increment quiescent state counter in ksoftirqd()

If a machine is flooded by network frames, a cpu can spend 100% of its time
looping inside ksoftirqd() without calling schedule().
This can delay the RCU grace period to insane values.

Adding rcu_qsctr_inc() call in ksoftirqd() solves this problem.

Signed-off-by: Eric Dumazet <redacted>
---

diff --git a/kernel/softirq.c b/kernel/softirq.c
index bdbe9de..9041ea7 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c

@@ -626,6 +626,7 @@ static int ksoftirqd(void * __bind_cpu)
 			preempt_enable_no_resched();
 			cond_resched();
 			preempt_disable();
+			rcu_qsctr_inc((long)__bind_cpu);
 		}
 		preempt_enable();
 		set_current_state(TASK_INTERRUPTIBLE);

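The effect of the added call can be pictured with a toy userspace model. It
is only a sketch, not the kernel's RCU implementation: the toy_* names and
the tick loop are hypothetical, and the only assumption carried over from
the patch is that a grace period cannot end until every cpu has reported a
quiescent state since it began.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8   /* matches the dual quad core box above */

static long qs_count[NR_CPUS];  /* per-cpu quiescent-state counters */
static long qs_snap[NR_CPUS];   /* counter values when the grace period began */

/* Toy stand-in for rcu_qsctr_inc(): this cpu passed a quiescent state. */
static void toy_qsctr_inc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	qs_count[cpu]++;
}

/* Begin a grace period by recording every cpu's current counter. */
static void toy_gp_start(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		qs_snap[cpu] = qs_count[cpu];
}

/* The grace period is over once every counter has moved past its snapshot. */
static bool toy_gp_done(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (qs_count[cpu] == qs_snap[cpu])
			return false;
	return true;
}

/*
 * Run up to 'ticks' ticks. Every cpu except busy_cpu reports a quiescent
 * state on each tick. busy_cpu is stuck in the softirq loop and reports one
 * only when 'patched' is true. Returns the tick on which the grace period
 * completed, or -1 if it never did.
 */
static int toy_run(int busy_cpu, bool patched, int ticks)
{
	toy_gp_start();
	for (int t = 1; t <= ticks; t++) {
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			if (cpu != busy_cpu || patched)
				toy_qsctr_inc(cpu);
		if (toy_gp_done())
			return t;
	}
	return -1;
}
```

In this model toy_run(2, false, 1000) never completes, while
toy_run(2, true, 1000) completes on the first tick; on the real machine the
unpatched grace period still ended eventually, only much later.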
