Thread (30 messages) 30 messages, 5 authors, 2023-07-26

Re: [PATCH RFC net-next v2 2/7] net: page_pool: place frag_* fields in one cacheline

From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: 2023-07-26 10:41:26
Also in: lkml

From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: Wed, 26 Jul 2023 11:13:26 +0300
Apologies for the late reply, I was on vacation and start going
through my email piles...
No worries. I remember having to grind through hundreds of mails after
each vacation :s :D
On Tue, 18 Jul 2023 at 16:52, Alexander Lobakin
[off-list ref] wrote:
quoted
From: Jesper Dangaard Brouer <redacted>
Date: Fri, 14 Jul 2023 20:37:39 +0200
quoted

On 14/07/2023 19.08, Alexander Lobakin wrote:
quoted
On x86_64, frag_* fields of struct page_pool are scattered across two
cachelines despite the summary size of 24 bytes. The last field,
::frag_users, is pushed out to the next one, sharing it with
::alloc_stats.
All three fields are used in pretty much the same places. There are some
holes and cold members to move around. Move frag_* one block up, placing
them right after &page_pool_params perfectly at the beginning of CL2.
This doesn't do any meaningful to the second block, as those are some
destroy-path cold structures, and doesn't do anything to ::alloc_stats,
which still starts at 200-byte offset, 8 bytes after CL3 (still fitting
into 1 cacheline).
On my setup, this yields 1-2% of Mpps when using PP frags actively.
When it comes to 32-bit architectures with 32-byte CL: &page_pool_params
plus ::pad is 44 bytes, the block taken care of is 16 bytes within one
CL, so there should be at least no regressions from the actual change.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
  include/net/page_pool.h | 10 +++++-----
  1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 829dc1f8ba6b..212d72b5cfec 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -130,16 +130,16 @@ static inline u64
*page_pool_ethtool_stats_get(u64 *data, void *stats)
  struct page_pool {
      struct page_pool_params p;
  +    long frag_users;
+    struct page *frag_page;
+    unsigned int frag_offset;
+    u32 pages_state_hold_cnt;
I think this is okay, but I want to highlight that:
 - pages_state_hold_cnt and pages_state_release_cnt
need to be kept on separate cache-lines.
They're pretty far away from each other. I moved hold_cnt here as well
to keep it stacked with frag_offset and avoid introducing 32-bit holes.
This is to prevent cache line bouncing and/or false sharing right?
The change seems fine to me as well but mind adding a comment about
this when you resend?
Right. Sure, why not.
Thanks
/Ilias
quoted
quoted
quoted
+
      struct delayed_work release_dw;
      void (*disconnect)(void *);
      unsigned long defer_start;
      unsigned long defer_warn;
  -    u32 pages_state_hold_cnt;
-    unsigned int frag_offset;
-    struct page *frag_page;
-    long frag_users;
-
  #ifdef CONFIG_PAGE_POOL_STATS
      /* these stats are incremented while in softirq context */
      struct page_pool_alloc_stats alloc_stats;
Thanks,
Olek
Thanks,
Olek
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help