Re: Linux router performance (3c59x) (fwd)

From: Robert Olsson <hidden>
Date: 2003-03-18 09:54:48

Ralph Doncaster writes:
 > I haven't heard from Jamal or Dave, so perhaps someone from this list has
 > some wisdom to impart.
 > Currently the box in question is running a 67% system load with ~40kpps.
 > Here's the switch port stats that the 2 3c905cx cards are plugged into:

 Hello!

 First, we do a lot of testing of the routing path but have no experience
 with the hardware you have (3c59x or Duron).

 In general it seems hard to extrapolate performance from a figure like
 X1 % CPU at X2 pps. You don't see the CPU time used in IRQ context, nor
 in some of the softirqs.

 I think a better way to run these tests is to input "overload" so your
 system gets saturated. You get the DoS test for free... After getting
 the throughput you have to figure out what your bottleneck is: CPU, PCI, etc.
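
 As a rough sketch of how the throughput under such an overload could be
 read off the box itself: the snippet below samples the receive packet
 counters in /proc/net/dev twice and turns the difference into packets
 per second. The function names and the 10 second interval are only
 illustrative assumptions, not part of any existing tool.

```python
import time


def parse_rx_packets(proc_net_dev_text):
    """Return {interface: rx_packets} parsed from the text of /proc/net/dev."""
    counters = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines carry no "iface:" prefix
        name, fields = line.split(":", 1)
        values = fields.split()
        # Receive columns start with: bytes, packets, ...
        counters[name.strip()] = int(values[1])
    return counters


def packets_per_second(before, after, interval_s):
    """Per-interface receive rate between two counter snapshots."""
    return {
        iface: (after[iface] - before[iface]) / interval_s
        for iface in after
        if iface in before
    }


def measure(interval_s=10.0, path="/proc/net/dev"):
    """Sample the counters twice, interval_s apart (hypothetical helper)."""
    with open(path) as f:
        before = parse_rx_packets(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_rx_packets(f.read())
    return packets_per_second(before, after, interval_s)
```

 Calling measure() while the generator is overloading the router gives
 the received rate per interface; comparing it with the transmit side
 or the far-end switch counters shows how much was actually forwarded.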

 > This is a box doing straight routing (no firewalling), with a full bgp4
 > routing table (>100k routes).  Kernel advanced router config option as
 > well as fastroute was chosen.

 The size of the routing table itself has no effect... The challenge comes
 when there is a high number of new "flows" per second, so that garbage
 collection becomes active. This can be seen with the rtstat program in
 the iproute2 package.

 Currently there is no driver with FASTROUTE support in the kernel so this 
 will not do you any good now.

 But Linux routing (and packet overload) performance is still very good.

 You can see performance numbers as well as profiles for different setups at:

 http://robur.slu.se/Linux/net-development/experiments/router-profile.html

 As seen, packet memory allocation is one of the CPU consumers. We also
 see that slab is not fully per-CPU, so we are spinning in the SMP
 case.

 And as seen, UP gives about 345 kpps. skb recycling bumps this up to
 507 kpps. The challenge for now is to get aggregated performance with SMP.
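
 To put those two figures in perspective, a small piece of arithmetic
 converts a packet rate into the average time available per packet. It
 uses only the 345 and 507 kpps numbers quoted above; the function name
 is just an illustrative choice.

```python
def ns_per_packet(kpps):
    """Average time per forwarded packet, in nanoseconds, at a given kpps."""
    return 1e9 / (kpps * 1e3)


UP_KPPS = 345        # uniprocessor figure quoted above
RECYCLED_KPPS = 507  # same setup with skb recycling

budget_plain = ns_per_packet(UP_KPPS)
budget_recycled = ns_per_packet(RECYCLED_KPPS)
speedup = RECYCLED_KPPS / UP_KPPS

print(f"without recycling: {budget_plain:.0f} ns per packet")
print(f"with recycling:    {budget_recycled:.0f} ns per packet")
print(f"gain:              {(speedup - 1) * 100:.0f} %")
```

 That is roughly 2.9 us per packet without recycling against about
 2.0 us with it, i.e. around 47 % more packets per second.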

 Also remember that networking, and routing in particular, is very much
 data transport: DMA transfers to and from memory. These have to interact
 with the CPU/driver arbitrating for the bus to manage these DMAs.
 Latencies and serializations are not obvious at this level.

 Cheers.
						--ro
