Re: [0/14] GRO: Lots of microoptimisations
From: David Miller <davem@davemloft.net>
Date: 2009-06-12 23:48:31
From: Benjamin LaHaise <redacted>
Date: Fri, 12 Jun 2009 12:09:26 -0400

> I found at least one reason why: the first skb_shinfo()->frag_list touch in dev_gro_receive() was causing a cache miss. Adding a prefetch in the driver helps that a little bit, but there's still > 500Mbps difference.

I find a 500Mbps difference, due to just one single cache miss on every packet, simply astounding and unbelievable. But hey, it is what you are seeing, so something has to account for it. :)
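
For reference, the driver-side prefetch described in the quoted text could look roughly like the sketch below. This is a userspace illustration only, not the patch under discussion: struct mock_skb, struct mock_shared_info, mock_rx_prefetch() and mock_frag_list_is_empty() are hypothetical stand-ins for the kernel's sk_buff, skb_shared_info and the real receive path, and __builtin_prefetch() (a GCC/Clang builtin) is assumed in place of the kernel's prefetch() helper.

```c
#include <stddef.h>

/* Hypothetical stand-in for skb_shared_info; not the real kernel layout. */
struct mock_shared_info {
    void *frag_list;
};

/* Hypothetical stand-in for sk_buff; only the field this sketch needs. */
struct mock_skb {
    struct mock_shared_info *shinfo;
};

/*
 * Driver side (assumed placement): issue a read prefetch for the shared
 * info as soon as the buffer is known, so the cache line is hopefully
 * resident by the time the GRO path reads frag_list.
 */
static void mock_rx_prefetch(const struct mock_skb *skb)
{
    __builtin_prefetch(skb->shinfo, 0 /* read */, 3 /* keep in cache */);
}

/*
 * Stack side: the first touch of frag_list, which is the access that
 * was reported to miss in cache.
 */
static int mock_frag_list_is_empty(const struct mock_skb *skb)
{
    return skb->shinfo->frag_list == NULL;
}
```

A prefetch is only a hint: it changes timing, not results, and it helps only if enough work happens between mock_rx_prefetch() and the first read for the line to arrive.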