Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support

[PATCH RFC 0/7] add socket to netdev page frag recycling support · Yunsheng Lin <hidden> · 2021-08-18
[PATCH RFC 3/7] net: add NAPI api to register and retrieve the page pool ptr · Yunsheng Lin <hidden> · 2021-08-18
[PATCH RFC 6/7] net: hns3: support tx recycling in the hns3 driver · Yunsheng Lin <hidden> · 2021-08-18
[PATCH RFC 1/7] page_pool: refactor the page pool to support multi alloc context · Yunsheng Lin <hidden> · 2021-08-18
[PATCH RFC 4/7] net: pfrag_pool: add pfrag pool support based on page pool · Yunsheng Lin <hidden> · 2021-08-18
[PATCH RFC 5/7] sock: support refilling pfrag from pfrag_pool · Yunsheng Lin <hidden> · 2021-08-18
[PATCH RFC 2/7] skbuff: add interface to manipulate frag count for tx recycling · Yunsheng Lin <hidden> · 2021-08-18
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Eric Dumazet <edumazet@google.com> · 2021-08-18
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Yunsheng Lin <hidden> · 2021-08-18
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Yunsheng Lin <hidden> · 2021-08-23
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Eric Dumazet <edumazet@google.com> · 2021-08-23
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Yunsheng Lin <hidden> · 2021-08-24
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · David Ahern <hidden> · 2021-08-25
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Eric Dumazet <edumazet@google.com> · 2021-08-25
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · David Ahern <hidden> · 2021-08-25
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Eric Dumazet <edumazet@google.com> · 2021-08-25
Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · David Ahern <hidden> · 2021-08-26
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · David Ahern <hidden> · 2021-08-18
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Yunsheng Lin <hidden> · 2021-08-19
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · David Ahern <hidden> · 2021-08-20
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Yunsheng Lin <hidden> · 2021-08-23
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · David Ahern <hidden> · 2021-08-24
Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support · Yunsheng Lin <hidden> · 2021-08-24

From: David Ahern <hidden>
Date: 2021-08-25 16:39:06
Also in: lkml, netdev

On 8/25/21 9:32 AM, Eric Dumazet wrote:

On Wed, Aug 25, 2021 at 9:29 AM David Ahern [off-list ref] wrote:


On 8/23/21 8:04 AM, Eric Dumazet wrote:



It seems PAGE_ALLOC_COSTLY_ORDER is mostly related to pcp pages, OOM, memory
compaction and memory isolation. As the test system has a lot of memory installed
(about 500G, only 3-4G is used), I used the below patch to test the max
possible performance improvement when making TCP frags twice as big, and
the performance improved from about 30Gbit to 32Gbit for a single-thread
iperf TCP flow in IOMMU strict mode,

This is encouraging, and means we can do much better.

Even with SKB_FRAG_PAGE_ORDER set to 4, typical skbs will need 3 mappings:

1) One for the headers (in skb->head)
2) Two page frags, because one TSO packet payload is not a nice power-of-two.
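
The mapping count above can be checked with a little arithmetic. The following is a rough model, not kernel code: it assumes 4 KB base pages, a TSO payload of 45 x 1448 = 65160 bytes (an illustrative figure, not taken from this thread), and a page frag that is consumed sequentially, so a payload starts wherever the previous one ended. The function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KB base pages


def frags_spanned(payload_len: int, frag_order: int, start_offset: int) -> int:
    """Count the page-frag buffers a payload touches.

    frag_order plays the role of SKB_FRAG_PAGE_ORDER; start_offset is where
    the payload begins inside the current page frag.
    """
    frag_size = PAGE_SIZE << frag_order
    if payload_len <= 0:
        return 0
    if not 0 <= start_offset < frag_size:
        raise ValueError("start_offset must fall inside the first page frag")
    end = start_offset + payload_len
    # Ceiling division: the payload starts in buffer 0 and ends in the last one.
    return -(-end // frag_size)


def dma_mappings(payload_len: int, frag_order: int, start_offset: int) -> int:
    """One mapping for the headers in skb->head, plus one per page frag."""
    return 1 + frags_spanned(payload_len, frag_order, start_offset)


TSO_PAYLOAD = 45 * 1448  # assumed payload size: 65160 bytes

# Order 4 (64 KB frags): only a payload starting at offset 0 fits in one frag.
print(dma_mappings(TSO_PAYLOAD, 4, 0))
# Any realistic non-zero offset straddles two frags, hence three mappings.
print(dma_mappings(TSO_PAYLOAD, 4, 1000))
# Order 3 (32 KB frags) with the same offset needs one more mapping.
print(dma_mappings(TSO_PAYLOAD, 3, 1000))
```

Under these assumptions the payload is slightly smaller than 64 KB, so it almost never lines up with the start of a 64 KB frag, which is why two page frags is the typical case even at order 4.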

Interesting observation. I have noticed 17 with the ZC API. That might
explain the less-than-expected performance bump with IOMMU strict mode.

Note that if the application is using huge pages, things get better after

commit 394fcd8a813456b3306c423ec4227ed874dfc08b
Author: Eric Dumazet [off-list ref]
Date:   Thu Aug 20 08:43:59 2020 -0700

    net: zerocopy: combine pages in zerocopy_sg_from_iter()

    Currently, tcp sendmsg(MSG_ZEROCOPY) is building skbs with order-0
    fragments. Compared to standard sendmsg(), these skbs usually contain
    up to 16 fragments on arches with 4KB page sizes, instead of two.

    This adds considerable costs on various ndo_start_xmit() handlers,
    especially when IOMMU is in the picture.

    As high performance applications are often using huge pages,
    we can try to combine adjacent pages belonging to same
    compound page.

    Tested on AMD Rome platform, with IOMMU, nominal single TCP flow speed
    is roughly doubled (~55Gbit -> ~100Gbit), when user application
    is using hugepages.

    For reference, nominal single TCP flow speed on this platform
    without MSG_ZEROCOPY is ~65Gbit.

    Signed-off-by: Eric Dumazet [off-list ref]
    Cc: Willem de Bruijn [off-list ref]
    Signed-off-by: David S. Miller [off-list ref]
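
The fragment counts in the commit message can be reproduced with a simple model. This is only a sketch: it assumes a 64 KB send, 4 KB base pages and 2 MB huge pages, and it treats "combining adjacent pages belonging to the same compound page" as producing one fragment per compound page touched. The function name and parameters are hypothetical and do not mirror zerocopy_sg_from_iter().

```python
from typing import Optional

PAGE_SIZE = 4096              # assumption: 4 KB base pages
HUGE_PAGE_SIZE = 2 * 1024**2  # assumption: 2 MB huge pages


def zerocopy_frags(user_addr: int, length: int,
                   compound_size: Optional[int] = None) -> int:
    """Estimate skb fragments for a zerocopy send of a user buffer.

    Without combining, every order-0 page touched becomes one fragment.
    With combining (compound_size set), adjacent pages inside the same
    compound page collapse into a single fragment.
    """
    if length <= 0:
        return 0
    unit = compound_size if compound_size is not None else PAGE_SIZE
    first = user_addr // unit
    last = (user_addr + length - 1) // unit
    return last - first + 1


SEND_LEN = 64 * 1024  # assumed send size

# Page-aligned buffer on order-0 pages: 16 fragments.
print(zerocopy_frags(0, SEND_LEN))
# Unaligned buffer touches one extra page: 17 fragments.
print(zerocopy_frags(100, SEND_LEN))
# Same buffer backed by a single huge page: 1 fragment.
print(zerocopy_frags(100, SEND_LEN, compound_size=HUGE_PAGE_SIZE))
# Buffer straddling two huge pages: 2 fragments.
print(zerocopy_frags(HUGE_PAGE_SIZE - PAGE_SIZE, SEND_LEN,
                     compound_size=HUGE_PAGE_SIZE))
```

In this model an unaligned 64 KB buffer on order-0 pages yields 17 fragments, which may be what the "17 with the ZC API" remark above refers to. Each fragment is a separate DMA mapping for the driver, which is where the IOMMU cost comes from.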

Ideally the gup stuff should really directly deal with hugepages, so that
we avoid all these crazy refcounting games on the per-huge-page central
refcount.

Thanks for the pointer. I need to revisit my past attempt to get iperf3
working with hugepages.
