Thread: 14 messages, 5 authors, 2014-02-11

Re: 8% performance improved by change tap interact with kernel stack

From: Stephen Hemminger <stephen@networkplumber.org>
Date: 2014-01-28 16:58:34
Also in: kvm

On Tue, 28 Jan 2014 12:33:25 +0200
"Michael S. Tsirkin" [off-list ref] wrote:
On Tue, Jan 28, 2014 at 06:19:02PM +0800, Qin Chuanyu wrote:
On 2014/1/28 17:41, Michael S. Tsirkin wrote:
I think it's okay - IIUC this way we are processing xmit directly
instead of going through softirq.
Was meaning to try this - I'm glad you are looking into this.

Could you please check latency results?
netperf UDP_RR 512
test model: VM->host->host

before modification: 11108
after modification : 11480

3% gained by this patch
Nice.
What about CPU utilization?
It's trivially easy to speed up networking by
burning up a lot of CPU so we must make sure it's
not doing that.
And I think we should see some tests with TCP as well, and
try several message sizes.
Yes, by burning up more CPU we could easily get better performance.
So I bound the vhost thread and the NIC interrupt to CPU1 while testing.

Before the modification, CPU1 was 0%-1% idle during the test;
after the modification, CPU1 was 2%-3% idle during the test.

TCP could also gain from this, but its pps is lower than UDP's, so I think
the improvement would be less obvious.
We still need to test that this doesn't regress, but overall it looks convincing to me.
Could you send a patch, accompanied by testing results for
throughput, latency, and CPU utilization for TCP and UDP
with various message sizes?

Thanks!
There are a couple of potential problems with this. The primary one is
that you are now violating the explicit assumptions about when netif_receive_skb()
can be called, and because of that it may break things all over the place.

 *
 *	netif_receive_skb() is the main receive data processing function.
 *	It always succeeds. The buffer may be dropped during processing
 *	for congestion control or by the protocol layers.
 *
 *	This function may only be called from softirq context and interrupts
 *	should be enabled.

At a minimum, softirq (BH) and preempt must be disabled.
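
To make the context requirement concrete, here is a small userspace model, not kernel code. Every model_* name is a hypothetical stand-in invented for illustration; in the kernel the corresponding calls would be local_bh_disable(), local_bh_enable() and netif_receive_skb(). It only shows the shape of wrapping the direct receive call in a BH-disabled section. Whether that alone is sufficient for the proposed change is an assumption the thread does not settle.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins only; none of this is kernel code. */
static int bh_disable_depth;   /* models the nesting count of BH-disable */
static int delivered;          /* packets accepted by the modelled stack */

struct model_skb { int len; }; /* hypothetical placeholder for an skb */

static void model_local_bh_disable(void)
{
    bh_disable_depth++;
}

static void model_local_bh_enable(void)
{
    assert(bh_disable_depth > 0);  /* enable must pair with a disable */
    bh_disable_depth--;
}

/* Models the documented rule: refuse delivery unless BH is disabled. */
static bool model_netif_receive_skb(struct model_skb *skb)
{
    (void)skb;
    if (bh_disable_depth == 0)
        return false;              /* called from the wrong context */
    delivered++;
    return true;
}

/* Direct (non-softirq) delivery, wrapped so the context rule holds. */
static bool model_tun_xmit_direct(struct model_skb *skb)
{
    bool ok;

    model_local_bh_disable();
    ok = model_netif_receive_skb(skb);
    model_local_bh_enable();
    return ok;
}
```

In this model, calling model_netif_receive_skb() directly from "process context" is rejected, while model_tun_xmit_direct() delivers the packet and leaves the BH-disable count balanced.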

Another potential problem is that since a softirq is not used, the kernel stack
usage may be much larger.

Maybe a better way would be implementing some form of NAPI in the TUN device?
