Re: [PATCH net-next] net: mana: Force full-page RX buffers for 4K page size on specific systems.
From: Paolo Abeni <pabeni@redhat.com>
Date: 2026-03-03 10:56:35
Also in:
linux-hyperv, linux-rdma, lkml
On 2/27/26 11:15 AM, Dipayaan Roy wrote:
> On certain systems configured with 4K PAGE_SIZE, utilizing page_pool
> fragments for RX buffers results in a significant throughput regression.
> Profiling reveals that this regression correlates with high overhead in the
> fragment allocation and reference counting paths on these specific
> platforms, rendering the multi-buffer-per-page strategy counterproductive.
>
> To mitigate this, bypass the page_pool fragment path and force a single RX
> packet per page allocation when all the following conditions are met:
>
>   1. The system is configured with a 4K PAGE_SIZE.
>   2. A processor-specific quirk is detected via SMBIOS Type 4 data.
>
> This approach restores expected line-rate performance by ensuring
> predictable RX refill behavior on affected hardware.
>
> There is no behavioral change for systems using larger page sizes
> (16K/64K), or platforms where this processor-specific quirk do not
> apply.
>
> Signed-off-by: Dipayaan Roy <redacted>
> ---
>  .../net/ethernet/microsoft/mana/gdma_main.c   | 120 ++++++++++++++++++
>  drivers/net/ethernet/microsoft/mana/mana_en.c |  23 +++-
>  include/net/mana/gdma.h                       |  10 ++
>  3 files changed, 151 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 0055c231acf6..26bbe736a770 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -9,6 +9,7 @@
>  #include <linux/msi.h>
>  #include <linux/irqdomain.h>
>  #include <linux/export.h>
> +#include <linux/dmi.h>
>  #include <net/mana/mana.h>
>  #include <net/mana/hw_channel.h>
> @@ -1955,6 +1956,115 @@ static bool mana_is_pf(unsigned short dev_id)
>  	return dev_id == MANA_PF_DEVICE_ID;
>  }
>
> +/*
> + * Table for Processor Version strings found from SMBIOS Type 4 information,
> + * for processors that needs to force single RX buffer per page quirk for
> + * meeting line rate performance with ARM64 + 4K pages.
> + * Note: These strings are exactly matched with version fetched from SMBIOS.
> + */
> +static const char * const mana_single_rxbuf_per_page_quirk_tbl[] = {
> +	"Cobalt 200",
> +};
> +
> +static const char *smbios_get_string(const struct dmi_header *hdr, u8 idx)
> +{
> +	const u8 *start, *end;
> +	u8 i;
> +
> +	/* Indexing starts from 1. */
> +	if (!idx)
> +		return NULL;
> +
> +	start = (const u8 *)hdr + hdr->length;
> +	end = start + SMBIOS_STR_AREA_MAX;
> +
> +	for (i = 1; i < idx; i++) {
> +		while (start < end && *start)
> +			start++;
> +		if (start < end)
> +			start++;
> +		if (start + 1 < end && start[0] == 0 && start[1] == 0)
> +			return NULL;
> +	}
> +
> +	if (start >= end || *start == 0)
> +		return NULL;
> +
> +	return (const char *)start;
If I read correctly, the above sort of duplicates dmi_decode_table(). I
think you would be better off to:

- use the mana_get_proc_ver_from_smbios() decoder to store the
  SMBIOS_TYPE4_PROC_VERSION_OFFSET index into gd
- do a 2nd walk with a different decoder to fetch the string at the
  specified index.

/P
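
To illustrate the two-pass idea, here is a minimal userspace sketch, not driver code. It assumes a raw SMBIOS structure table in memory; walk_table(), record_index(), fetch_string(), lookup_proc_version() and struct proc_ver are hypothetical names made up for this example. In the driver the walk itself would presumably be left to dmi_walk(), with only the two decoders living in mana.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TYPE4_PROCESSOR		4
#define PROC_VERSION_OFFSET	0x10	/* Processor Version string index in Type 4 */

/* Hypothetical private data shared by the two passes. */
struct proc_ver {
	uint8_t str_idx;	/* filled by the first pass */
	const char *version;	/* filled by the second pass */
};

typedef void (*decode_fn)(const uint8_t *rec, const uint8_t *end, void *priv);

/* Generic walker: visit each structure, then skip its formatted area and
 * its string-set, which ends with a double NUL. */
static void walk_table(const uint8_t *buf, size_t len, decode_fn decode, void *priv)
{
	const uint8_t *p = buf, *end = buf + len;

	while (end - p >= 4) {
		const uint8_t *s;

		if (p[1] < 4 || p[1] > end - p)
			break;		/* malformed structure length */
		decode(p, end, priv);

		s = p + p[1];
		while (end - s >= 2 && (s[0] || s[1]))
			s++;
		if (end - s < 2)
			break;		/* string-set not terminated */
		p = s + 2;
	}
}

/* Pass 1: remember the string index stored in the first Type 4 structure. */
static void record_index(const uint8_t *rec, const uint8_t *end, void *priv)
{
	struct proc_ver *pv = priv;

	(void)end;
	if (rec[0] != TYPE4_PROCESSOR || rec[1] <= PROC_VERSION_OFFSET)
		return;
	if (!pv->str_idx)
		pv->str_idx = rec[PROC_VERSION_OFFSET];
}

/* Pass 2: fetch the string with that (1-based) index from the string-set. */
static void fetch_string(const uint8_t *rec, const uint8_t *end, void *priv)
{
	struct proc_ver *pv = priv;
	const uint8_t *s = rec + rec[1];
	unsigned int i;

	if (rec[0] != TYPE4_PROCESSOR || !pv->str_idx || pv->version)
		return;

	for (i = 1; s < end && *s; i++) {
		const uint8_t *nul = memchr(s, 0, (size_t)(end - s));

		if (!nul)
			return;		/* unterminated string */
		if (i == pv->str_idx) {
			pv->version = (const char *)s;
			return;
		}
		s = nul + 1;
	}
}

/* Returns the Processor Version string, or NULL if it cannot be found. */
static const char *lookup_proc_version(const uint8_t *buf, size_t len)
{
	struct proc_ver pv = { 0, NULL };

	walk_table(buf, len, record_index, &pv);
	walk_table(buf, len, fetch_string, &pv);
	return pv.version;
}
```

The returned string could then be compared exactly against the entries of the quirk table. Keeping the table iteration in one generic walker is the point of the suggestion: the decoders only look at a single structure at a time.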