Thread: 13 messages, 5 authors, 2012-10-31

Re: [PATCH] net: allow configuration of the size of page in __netdev_alloc_frag

From: Konrad Rzeszutek Wilk <hidden>
Date: 2012-10-30 17:39:04

On Wed, Oct 24, 2012 at 06:43:20PM +0200, Eric Dumazet wrote:
> On Wed, 2012-10-24 at 17:22 +0100, Ian Campbell wrote:
> > On Wed, 2012-10-24 at 16:21 +0100, Eric Dumazet wrote:
> > > If you really have such problems, why doesn't locally generated TCP
> > > traffic also have them?
> >
> > I think it does. The reason I noticed the original problem was that ssh
> > to the machine was virtually (no pun intended) unusable.
> >
> > > Your patch doesn't touch sk_page_frag_refill(), does it?
> >
> > That's right. It doesn't. When is (sk->sk_allocation & __GFP_WAIT) true?
> > Is it possible I'm just not hitting that case?
>
> I hope not. GFP_KERNEL has __GFP_WAIT.
>
> > Is it possible that this only affects certain traffic patterns (I only
> > really tried ssh/scp and ping)? Or perhaps it's just that the swiotlb is
> > only broken in one corner case and not the other.
>
> Could you try a netperf -t TCP_STREAM?

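On the __GFP_WAIT question quoted above, a small userspace sketch of the flag test may help. It is not kernel code. The flag values are what I believe include/linux/gfp.h used around v3.7, and frag_order() and FRAG_PAGE_ORDER are hypothetical names that only mirror my reading of the order selection in sk_page_frag_refill(); please check both against your tree.

```python
# Userspace model of the allocation-flag test, NOT kernel code.
# Assumed values, believed to match include/linux/gfp.h around v3.7.
GFP_WAIT = 0x10  # stands in for __GFP_WAIT: the allocation may sleep
GFP_HIGH = 0x20  # stands in for __GFP_HIGH
GFP_IO = 0x40    # stands in for __GFP_IO
GFP_FS = 0x80    # stands in for __GFP_FS

GFP_KERNEL = GFP_WAIT | GFP_IO | GFP_FS
GFP_ATOMIC = GFP_HIGH

# Assumption: a 32 KiB fragment page with 4 KiB base pages, i.e. order 3.
FRAG_PAGE_ORDER = 3


def frag_order(sk_allocation: int) -> int:
    """Hypothetical helper: pick the page order to try for a socket.

    High-order (compound) pages are only attempted when the socket's
    allocation mask allows waiting; otherwise fall back to order 0.
    """
    return FRAG_PAGE_ORDER if sk_allocation & GFP_WAIT else 0


print("GFP_KERNEL ->", frag_order(GFP_KERNEL))
print("GFP_ATOMIC ->", frag_order(GFP_ATOMIC))
```

Under these assumptions a socket allocating with GFP_KERNEL takes the compound-page path and one allocating with GFP_ATOMIC does not.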
For fun I did a couple of tests - I set up two machines (one r8169, the other
e1000e) and ran netperf/netserver between them. Both are running a bare-metal
kernel, and one of them has 'iommu=soft swiotlb=force' to simulate the worst
case. This is using v3.7-rc3.

The r8169 is booted without any arguments, the e1000e is using 'iommu=soft
swiotlb=force'.

So r8169 -> e1000e, I get ~940 (this is odd: I expected that the e1000e
on the recv side would be using the bounce buffer, but then I realized it
sets up a 'dma' pool using pci_alloc_coherent).

The other way, e1000e -> r8169, got me ~128. So it is the sending
side that ends up using the bounce buffer, and it slows down considerably.

I also swapped the machine that had e1000e with a tg3 - and got around
the same numbers.

So all of this points to the swiotlb. To make sure that nothing was amiss,
I wrote a little driver that allocates a compound page, sets up a DMA
mapping, does some writes, syncs, and unmaps the DMA page. It works
correctly, so swiotlb (and the Xen variant) works just right.
Attached for your fun.
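
For readers without the attachment, here is a rough userspace model of that map / write / sync / unmap sequence. It is not the attached driver and not kernel code: BounceMapping, Direction and roundtrip() are hypothetical names, and the copy rules are only my understanding of how a swiotlb-style bounce buffer behaves. The 32 KiB size stands in for an order-3 compound page and is likewise an assumption.

```python
from enum import Enum


class Direction(Enum):
    TO_DEVICE = 1
    FROM_DEVICE = 2
    BIDIRECTIONAL = 3


class BounceMapping:
    """Hypothetical userspace stand-in for one bounce-buffered DMA mapping."""

    def __init__(self, cpu_buf: bytearray, direction: Direction):
        self.cpu_buf = cpu_buf
        self.direction = direction
        self.bounce = bytearray(len(cpu_buf))  # what the "device" sees
        self.copies = 0  # every copy is the extra cost of bouncing
        if direction in (Direction.TO_DEVICE, Direction.BIDIRECTIONAL):
            self._to_bounce()  # mapping copies the data out

    def _to_bounce(self) -> None:
        self.bounce[:] = self.cpu_buf
        self.copies += 1

    def _from_bounce(self) -> None:
        self.cpu_buf[:] = self.bounce
        self.copies += 1

    def sync_for_device(self) -> None:
        # Publish CPU writes made after the mapping was set up.
        if self.direction in (Direction.TO_DEVICE, Direction.BIDIRECTIONAL):
            self._to_bounce()

    def unmap(self) -> None:
        # Pull device writes back into the original buffer.
        if self.direction in (Direction.FROM_DEVICE, Direction.BIDIRECTIONAL):
            self._from_bounce()
        self.bounce = None


def roundtrip(size: int = 8 * 4096):
    """Map, write from the CPU, sync, let the 'device' write, unmap."""
    cpu_buf = bytearray(size)
    mapping = BounceMapping(cpu_buf, Direction.BIDIRECTIONAL)
    cpu_buf[:4] = b"ping"            # CPU write after mapping
    mapping.sync_for_device()        # device can now see it
    device_saw = bytes(mapping.bounce[:4])
    mapping.bounce[4:8] = b"pong"    # simulated device write
    mapping.unmap()                  # copied back on unmap
    return device_saw, bytes(cpu_buf[:8]), mapping.copies
```

Calling roundtrip() in this model shows the data surviving in both directions at the cost of three full-buffer copies, which is the per-mapping overhead the sending side pays when every transmit buffer is bounced.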

Then I decided to try v3.6.3, with the exact same parameters, and
the problem went away.

The e1000e -> r8169 direction, which got me ~128, now gets ~940! It is
still using the swiotlb bounce buffer.
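
To put the two figures side by side, here is the arithmetic as a short sketch. The inputs are the approximate netperf results reported above, and relative_drop() is a hypothetical helper. Only the ratio matters, so the unit (I believe netperf's TCP_STREAM default is 10^6 bits/sec) cancels out.

```python
def relative_drop(baseline: float, degraded: float) -> float:
    """Percentage by which `degraded` falls short of `baseline`."""
    return (baseline - degraded) / baseline * 100.0


good = 940.0  # approx. e1000e -> r8169 on v3.6.3
bad = 128.0   # approx. e1000e -> r8169 on v3.7-rc3

print(f"drop: {relative_drop(good, bad):.1f}%")
print(f"slowdown: {good / bad:.1f}x")
```

That works out to a drop of roughly 86%, or a bit more than a 7x slowdown, between v3.6.3 and v3.7-rc3 on the sending side.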

> Because ssh uses small packets, and small TCP packets don't use frags but
> skb->head.
>
> You mentioned a 70% drop of performance, but which test did you use,
> exactly?

Note, I did not provide any arguments to netperf, but it did pick the
test you wanted:
netperf -H tst019
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tst019.dumpdata.com (192.168.101.39) port 0 AF_INET
