Re: [0/14] GRO: Lots of microoptimisations
From: Benjamin LaHaise <hidden>
Date: 2009-06-16 16:35:53
On Fri, Jun 12, 2009 at 04:48:33PM -0700, David Miller wrote:
> I find a 500Mbps difference, due to just one single cache miss on every packet, simply astounding and unbelievable. But hey, it is what you are seeing, so something has to account for it. :)
The cache miss only accounts for ~50Mbps; it'd be nice if there was an easy way to get the whole 500Mbps back. The rest seems to be in the general overhead of the GRO code vs the normal NAPI rx path. The P4 Xeon is substantially worse at string operations than the Core 2 / Core i7 based Xeons, so I'm hoping to test and see if they do any better with the GRO code when I get access to a new machine soon.

-ben