Thread (41 messages) 41 messages, 9 authors, 2021-05-11

Re: [PATCH net-next v3 0/5] page_pool: recycle buffers

From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: 2021-05-06 12:58:27
Also in: bpf, linux-mm, linux-rdma, lkml

On Thu, May 06, 2021 at 08:34:48PM +0800, Yunsheng Lin wrote:
On 2021/5/1 0:24, Ilias Apalodimas wrote:
quoted
[...]
quoted
quoted
quoted
1. skb frag page recycling do not need "struct xdp_rxq_info" or
   "struct xdp_mem_info" to bond the relation between "struct page" and
   "struct page_pool", which seems uncessary at this point if bonding
   a "struct page_pool" pointer directly in "struct page" does not cause
   space increasing.
We can't do that. The reason we need those structs is that we rely on the
existing XDP code, which already recycles it's buffers, to enable
recycling.  Since we allocate a page per packet when using page_pool for a
driver , the same ideas apply to an SKB and XDP frame. We just recycle the
I am not really familar with XDP here, but a packet from hw is either a
"struct xdp_frame/xdp_buff" for XDP or a "struct sk_buff" for TCP/IP stack,
a packet can not be both "struct xdp_frame/xdp_buff" and "struct sk_buff" at
the same time, right?
Yes, but the payload is irrelevant in both cases and that's what we use
page_pool for.  You can't use this patchset unless your driver usues
build_skb().  So in both cases you just allocate memory for the payload and
I am not sure I understood why build_skb() matters here. If the head data of
a skb is a page frag and is from page pool, then it's page->signature should be
PP_SIGNATURE, otherwise it's page->signature is zero, so a recyclable skb does
not require it's head data being from a page pool, right?
Correct, and that's the big improvement compared to the original RFC.
The wording was a bit off in my initial response.  I was trying to point out
you can recycle *any* buffer coming from page_pool and one of the ways you can
do that in your driver, is use build_skb() while the payload is allocated by
page_pool.
quoted
decide what the wrap the buffer with (XDP or SKB) later.
[...]
quoted
quoted
I am not sure I understand what you meant by "free the skb", does it mean
that kfree_skb() is called to free the skb.
Yes
quoted
As my understanding, if the skb completely own the page(which means page_count()
== 1) when kfree_skb() is called, __page_pool_put_page() is called, otherwise
page_ref_dec() is called, which is exactly what page_pool_atomic_sub_if_positive()
try to handle it atomically.
Not really, the opposite is happening here. If the pp_recycle bit is set we
will always call page_pool_return_skb_page().  If the page signature matches
the 'magic' set by page pool we will always call xdp_return_skb_frame() will
end up calling __page_pool_put_page(). If the refcnt is 1 we'll try
to recycle the page.  If it's not we'll release it from page_pool (releasing
some internal references we keep) unmap the buffer and decrement the refcnt.
Yes, I understood the above is what the page pool do now.

But the question is who is still holding an extral reference to the page when
kfree_skb()? Perhaps a cloned and pskb_expand_head()'ed skb is holding an extral
reference to the same page? So why not just do a page_ref_dec() if the orginal skb
is freed first, and call __page_pool_put_page() when the cloned skb is freed later?
So that we can always reuse the recyclable page from a recyclable skb. This may
make the page_pool_destroy() process delays longer than before, I am supposed the
page_pool_destroy() delaying for cloned skb case does not really matters here.

If the above works, I think the samiliar handling can be added to RX zerocopy if
the RX zerocopy also hold extral references to the recyclable page from a recyclable
skb too?
Right, this sounds doable, but I'll have to go back code it and see if it
really makes sense.  However I'd still prefer the support to go in as-is
(including the struct xdp_mem_info in struct page, instead of a page_pool
pointer).

There's a couple of reasons for that.  If we keep the struct xdp_mem_info we
can in the future recycle different kind of buffers using __xdp_return().
And this is a non intrusive change if we choose to store the page pool address
directly in the future.  It just affects the internal contract between the
page_pool code and struct page.  So it won't affect any drivers that already
use the feature.
Regarding the page_ref_dec(), which as I said sounds doable, I'd prefer
playing it safe for now and getting rid of the buffers that somehow ended up
holding an extra reference.  Once this gets approved we can go back and try to
save the extra space.  I hope I am not wrong but the changes required to
support a few extra refcounts should not change the current patches much.

Thanks for taking the time on this!
/Ilias
quoted
[1] https://lore.kernel.org/netdev/154413868810.21735.572808840657728172.stgit@firesoul/ (local)

Cheers
/Ilias

.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help