Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Eric Dumazet <hidden>
Date: 2012-05-02 02:47:47
On Tue, 2012-05-01 at 16:10 -0700, Alexander Duyck wrote:
On 05/01/2012 03:58 PM, Alexander Duyck wrote:
Eric, I think I have found a bug, although it is not specific to this patch but it is related. It looks like the TCP coalesce code is causing tcpdump to fail when using frags. Based on the comments in the patch I am assuming you have an ixgbe adapter to test with as that is what I reproduced this on.

To reproduce the issue all you need to do is run "tcpdump -i ethX > /dev/null" on one console, and on a second console run a netperf TCP_MAERTS test to some other server. Tcpdump will exit out with a message about bad address like this:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
tcpdump: pcap_loop: recvfrom: Bad address
682 packets captured
2357 packets received by filter
1543 packets dropped by kernel

A bisect of the issue tracked it down to:

1402d366019fedaa2b024f2bac06b7cc9a8782e1 is first bad commit
commit 1402d366019fedaa2b024f2bac06b7cc9a8782e1
Author: Eric Dumazet [off-list ref]
Date:   Mon Apr 23 07:11:42 2012 +0000

    tcp: introduce tcp_try_coalesce

    commit c8628155ece3 (tcp: reduce out_of_order memory use) took care of
    coalescing tcp segments provided by legacy devices (linear skbs)

    We extend this idea to fragged skbs, as their truesize can be heavy.

    ixgbe for example uses 256+1024+PAGE_SIZE/2 = 3328 bytes per segment.

    Use this coalescing strategy for receive queue too.

    This contributes to reduce number of tcp collapses, at minimal cost, and
    reduces memory overhead and packets drops.

    Signed-off-by: Eric Dumazet [off-list ref]
    Cc: Neal Cardwell [off-list ref]
    Cc: Tom Herbert [off-list ref]
    Cc: Maciej Żenczykowski [off-list ref]
    Cc: Ilpo Järvinen [off-list ref]
    Acked-by: Neal Cardwell [off-list ref]
    Signed-off-by: David S. Miller [off-list ref]

:040000 040000 8ca3e0b4e6c6a8f375fd800069d24203880623f3 2576d34c5c9cfc717a11e2ebe054143956716b93 M net

I suspect we are dealing with either a shared or cloned skb in this case, though I haven't verified which it is yet.
Thanks,
Alex

One additional note. It looks like LRO and GRO need to be disabled to trigger the bug. If either of them are enabled it doesn't seem to occur. Likely due to the fact that they are doing the coalescing before it gets up to the tcp_try_coalesce call.
Thanks Alex, I'll take a look. It seems my tcpdump is different than yours.
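
For reference, the per-segment figure in the quoted commit message (256+1024+PAGE_SIZE/2 = 3328) only works out if PAGE_SIZE is 4096, which the message does not state. A minimal sketch of that arithmetic follows. The function name is hypothetical and the 4096 default is an assumption, neither comes from the kernel source.

```python
def ixgbe_segment_bytes(page_size: int = 4096) -> int:
    """Evaluate 256 + 1024 + PAGE_SIZE/2 from the quoted commit message.

    page_size defaults to 4096, an assumption; the commit message
    gives only the result (3328), not the page size it used.
    """
    return 256 + 1024 + page_size // 2


if __name__ == "__main__":
    # With 4 KiB pages this reproduces the 3328 bytes quoted above.
    print(ixgbe_segment_bytes(4096))
```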