Re: [PATCH net-next v5 3/5] page_pool: Allow drivers to hint on SKB recycling
From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: 2021-05-17 11:35:57
Also in:
bpf, linux-mm, linux-rdma, lkml
On Mon, May 17, 2021 at 07:10:09PM +0800, Yunsheng Lin wrote:
On 2021/5/17 17:36, Ilias Apalodimas wrote: >>quoted
quoted
Even if when skb->pp_recycle is 1, pages allocated from page allocator directly or page pool are both supported, so it seems page->signature need to be reliable to indicate a page is indeed owned by a page pool, which means the skb->pp_recycle is used mainly to short cut the code path for skb->pp_recycle is 0 case, so that the page->signature does not need checking?Yes, the idea for the recycling bit, is that you don't have to fetch the page in cache do do more processing (since freeing is asynchronous and we can't have any guarantees on what the cache will have at that point). So we are trying to affect the existing release path a less as possible. However it's that new skb bit that triggers the whole path. What you propose could still be doable though. As you said we can add the page pointer to struct page when we allocate a page_pool page and never reset it when we recycle the buffer. But I don't think there will be any performance impact whatsoever. So I prefer the 'visible' approach, at least forsetting and unsetting the page_pool ptr every time the page is recycled may cause a cache bouncing problem when rx cleaning and skb releasing is not happening on the same cpu.
In our case since the skb is asynchronous and not protected by a NAPI context, the buffer wont end up in the 'fast' page pool cache. So we'll recycle by calling page_pool_recycle_in_ring() not page_pool_recycle_in_cache(). Which means that the page you recycled will be re-filled later, in batches, when page_pool_refill_alloc_cache() is called to refill the fast cache. I am not i saying it might not happen, but I don't really know if it's going to make a difference or not. So I just really prefer taking this as is and perhaps later, when 40/100gbit drivers start using it we can justify the optimization (along with supporting the split page model). Thanks /Ilias
quoted
the first iteration. Thanks /Ilias .