Re: [PATCH v2] tcp: splice as many packets as possible at once
From: Jarek Poplawski <hidden>
Date: 2009-02-03 12:36:49
Also in:
lkml
On Tue, Feb 03, 2009 at 02:10:12PM +0300, Evgeniy Polyakov wrote:
> On Tue, Feb 03, 2009 at 09:41:08AM +0000, Jarek Poplawski (jarkao2@gmail.com) wrote:
> > > 1) Just like any other allocator we'll need to find a way to handle > PAGE_SIZE allocations, and thus add handling for compound pages etc. And exactly the drivers that want such huge SKB data areas on receive should be converted to use scatter gather page vectors in order to avoid multi-order pages and thus strains on the page allocator.
> >
> > I guess compound pages are handled by put_page() enough, but I don't think they should be main argument here, and I agree: scatter gather should be used where possible.
>
> Problem is to allocate them, since with the time memory will be quite fragmented, which will not allow to find a big enough page.
Yes, it's a problem, but I don't think it's the main one. Since we're currently concerned with zero-copy for splice, I think we could concentrate on the most common cases and treat jumbo frames on a best-effort basis only: if there are free compound pages, fine; otherwise we fall back to slab and copy in splice.
> NTA tried to solve this by not allowing to free the data allocated on the different CPU, contrary to what SLAB does. Modulo cache coherency improvements, it allows to combine freed chunks back into the pages and combine them in turn to get bigger contiguous areas suitable for the drivers which were not converted to use the scatter gather approach. I even believe that for some hardware it is the only way to deal with the jumbo frames.
> > > 2) Space wastage and poor packing can be an issue. Even with SLAB/SLUB we get poor packing, look at Evgeniy's graphs that he made when writing his NTA patches.
> >
> > I'm a bit lost here: could you "remind" the way page space would be used/saved in your paged variant e.g. for ~1500B skbs?
>
> At least in NTA I used cache line alignment for smaller chunks, while SLAB uses power of two. Thus for 1500 MTU SLAB wastes about 500 bytes per packet (modulo size of the shared info structure).
> > Yes, this looks reasonable. On the other hand, I think it would be nice to get some opinions of slab folks (incl. Evgeniy) on the expected efficiency of such a solution. (It seems releasing with put_page() will always have some cost with delayed reusing and/or waste of space.)
>
> Well, my opinion is rather biased here :)
I understand NTA could be better than slabs in the above-mentioned cases, but I'm not sure you explained your point on solving this zero-copy problem vs. NTA well enough?

Jarek P.