Thread (2 messages) 2 messages, 2 authors, 2024-06-11

RE: [PATCH net-next] net: mana: Add support for variable page sizes of ARM64

From: Michael Kelley <hidden>
Date: 2024-06-11 16:34:39
Also in: bpf, linux-rdma, lkml, netdev

From: LKML haiyangz <redacted> On Behalf Of Haiyang Zhang Sent: Monday, June 10, 2024 2:23 PM
As defined by the MANA Hardware spec, the queue size for DMA is 4KB
minimal, and power of 2.
You say the hardware requires 4K "minimal". But the definitions in this
patch hardcode to 4K, as if that's the only choice. Is the hardcoding to
4K a design decision made to simplify the MANA driver?
To support variable page sizes (4KB, 16KB, 64KB) of ARM64, define
A minor nit, but "variable" page size doesn't seem like quite the right
description -- both here and in the Subject line.  On ARM64, the page size
is a choice among a few fixed options.  Perhaps call it support for "page sizes
other than 4K"?
quoted hunk ↗ jump to hunk
the minimal queue size as a macro separate from the PAGE_SIZE, which
we always assumed it to be 4KB before supporting ARM64.
Also, update the relevant code related to size alignment, DMA region
calculations, etc.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/net/ethernet/microsoft/Kconfig        |  2 +-
 .../net/ethernet/microsoft/mana/gdma_main.c   |  8 +++----
 .../net/ethernet/microsoft/mana/hw_channel.c  | 22 +++++++++----------
 drivers/net/ethernet/microsoft/mana/mana_en.c |  8 +++----
 .../net/ethernet/microsoft/mana/shm_channel.c |  9 ++++----
 include/net/mana/gdma.h                       |  7 +++++-
 include/net/mana/mana.h                       |  3 ++-
 7 files changed, 33 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/Kconfig
b/drivers/net/ethernet/microsoft/Kconfig
index 286f0d5697a1..901fbffbf718 100644
--- a/drivers/net/ethernet/microsoft/Kconfig
+++ b/drivers/net/ethernet/microsoft/Kconfig
@@ -18,7 +18,7 @@ if NET_VENDOR_MICROSOFT
 config MICROSOFT_MANA
 	tristate "Microsoft Azure Network Adapter (MANA) support"
 	depends on PCI_MSI
-	depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN && ARM64_4K_PAGES)
+	depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN)
 	depends on PCI_HYPERV
 	select AUXILIARY_BUS
 	select PAGE_POOL
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c
b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 1332db9a08eb..c9df942d0d02 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -182,7 +182,7 @@ int mana_gd_alloc_memory(struct gdma_context *gc,
unsigned int length,
 	dma_addr_t dma_handle;
 	void *buf;

-	if (length < PAGE_SIZE || !is_power_of_2(length))
+	if (length < MANA_MIN_QSIZE || !is_power_of_2(length))
 		return -EINVAL;

 	gmi->dev = gc->dev;
@@ -717,7 +717,7 @@ EXPORT_SYMBOL_NS(mana_gd_destroy_dma_region,
NET_MANA);
 static int mana_gd_create_dma_region(struct gdma_dev *gd,
 				     struct gdma_mem_info *gmi)
 {
-	unsigned int num_page = gmi->length / PAGE_SIZE;
+	unsigned int num_page = gmi->length / MANA_MIN_QSIZE;
This calculation seems a bit weird when using MANA_MIN_QSIZE. The
number of pages, and the construction of the page_addr_list array
a few lines later, seem unrelated to the concept of a minimum queue
size. Is the right concept really a "mapping chunk", and num_page
would conceptually be "num_chunks", or something like that?  Then
a queue must be at least one chunk in size, but that's derived from the
chunk size, and is not the core concept.

Another approach might be to just call it "MANA_PAGE_SIZE", like
has been done with HV_HYP_PAGE_SIZE.  HV_HYP_PAGE_SIZE exists to
handle exactly the same issue of the guest PAGE_SIZE potentially
being different from the fixed 4K size that must be used in host-guest
communication on Hyper-V.  Same thing here with MANA.
quoted hunk ↗ jump to hunk
 	struct gdma_create_dma_region_req *req = NULL;
 	struct gdma_create_dma_region_resp resp = {};
 	struct gdma_context *gc = gd->gdma_context;
@@ -727,7 +727,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd,
 	int err;
 	int i;

-	if (length < PAGE_SIZE || !is_power_of_2(length))
+	if (length < MANA_MIN_QSIZE || !is_power_of_2(length))
 		return -EINVAL;

 	if (offset_in_page(gmi->virt_addr) != 0)
@@ -751,7 +751,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd,
 	req->page_addr_list_len = num_page;

 	for (i = 0; i < num_page; i++)
-		req->page_addr_list[i] = gmi->dma_handle +  i * PAGE_SIZE;
+		req->page_addr_list[i] = gmi->dma_handle +  i * MANA_MIN_QSIZE;

 	err = mana_gd_send_request(gc, req_msg_size, req, sizeof(resp), &resp);
 	if (err)
diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c
b/drivers/net/ethernet/microsoft/mana/hw_channel.c
index bbc4f9e16c98..038dc31e09cd 100644
--- a/drivers/net/ethernet/microsoft/mana/hw_channel.c
+++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c
@@ -362,12 +362,12 @@ static int mana_hwc_create_cq(struct hw_channel_context
*hwc, u16 q_depth,
 	int err;

 	eq_size = roundup_pow_of_two(GDMA_EQE_SIZE * q_depth);
-	if (eq_size < MINIMUM_SUPPORTED_PAGE_SIZE)
-		eq_size = MINIMUM_SUPPORTED_PAGE_SIZE;
+	if (eq_size < MANA_MIN_QSIZE)
+		eq_size = MANA_MIN_QSIZE;

 	cq_size = roundup_pow_of_two(GDMA_CQE_SIZE * q_depth);
-	if (cq_size < MINIMUM_SUPPORTED_PAGE_SIZE)
-		cq_size = MINIMUM_SUPPORTED_PAGE_SIZE;
+	if (cq_size < MANA_MIN_QSIZE)
+		cq_size = MANA_MIN_QSIZE;

 	hwc_cq = kzalloc(sizeof(*hwc_cq), GFP_KERNEL);
 	if (!hwc_cq)
@@ -429,7 +429,7 @@ static int mana_hwc_alloc_dma_buf(struct
hw_channel_context *hwc, u16 q_depth,

 	dma_buf->num_reqs = q_depth;

-	buf_size = PAGE_ALIGN(q_depth * max_msg_size);
+	buf_size = MANA_MIN_QALIGN(q_depth * max_msg_size);

 	gmi = &dma_buf->mem_info;
 	err = mana_gd_alloc_memory(gc, buf_size, gmi);
@@ -497,8 +497,8 @@ static int mana_hwc_create_wq(struct hw_channel_context
*hwc,
 	else
 		queue_size = roundup_pow_of_two(GDMA_MAX_SQE_SIZE *
q_depth);

-	if (queue_size < MINIMUM_SUPPORTED_PAGE_SIZE)
-		queue_size = MINIMUM_SUPPORTED_PAGE_SIZE;
+	if (queue_size < MANA_MIN_QSIZE)
+		queue_size = MANA_MIN_QSIZE;

 	hwc_wq = kzalloc(sizeof(*hwc_wq), GFP_KERNEL);
 	if (!hwc_wq)
@@ -628,10 +628,10 @@ static int mana_hwc_establish_channel(struct
gdma_context *gc, u16 *q_depth,
 	init_completion(&hwc->hwc_init_eqe_comp);

 	err = mana_smc_setup_hwc(&gc->shm_channel, false,
-				 eq->mem_info.dma_handle,
-				 cq->mem_info.dma_handle,
-				 rq->mem_info.dma_handle,
-				 sq->mem_info.dma_handle,
+				 virt_to_phys(eq->mem_info.virt_addr),
+				 virt_to_phys(cq->mem_info.virt_addr),
+				 virt_to_phys(rq->mem_info.virt_addr),
+				 virt_to_phys(sq->mem_info.virt_addr),
This change seems unrelated to handling guest PAGE_SIZE values
other than 4K.  Does it belong in a separate patch?  Or maybe it just
needs an explanation in the commit message of this patch?
quoted hunk ↗ jump to hunk
 				 eq->eq.msix_index);
 	if (err)
 		return err;
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
b/drivers/net/ethernet/microsoft/mana/mana_en.c
index d087cf954f75..6a891dbce686 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1889,10 +1889,10 @@ static int mana_create_txq(struct mana_port_context
*apc,
 	 *  to prevent overflow.
 	 */
 	txq_size = MAX_SEND_BUFFERS_PER_QUEUE * 32;
-	BUILD_BUG_ON(!PAGE_ALIGNED(txq_size));
+	BUILD_BUG_ON(!MANA_MIN_QALIGNED(txq_size));

 	cq_size = MAX_SEND_BUFFERS_PER_QUEUE * COMP_ENTRY_SIZE;
-	cq_size = PAGE_ALIGN(cq_size);
+	cq_size = MANA_MIN_QALIGN(cq_size);

 	gc = gd->gdma_context;
@@ -2189,8 +2189,8 @@ static struct mana_rxq *mana_create_rxq(struct
mana_port_context *apc,
 	if (err)
 		goto out;

-	rq_size = PAGE_ALIGN(rq_size);
-	cq_size = PAGE_ALIGN(cq_size);
+	rq_size = MANA_MIN_QALIGN(rq_size);
+	cq_size = MANA_MIN_QALIGN(cq_size);

 	/* Create RQ */
 	memset(&spec, 0, sizeof(spec));
diff --git a/drivers/net/ethernet/microsoft/mana/shm_channel.c
b/drivers/net/ethernet/microsoft/mana/shm_channel.c
index 5553af9c8085..9a54a163d8d1 100644
--- a/drivers/net/ethernet/microsoft/mana/shm_channel.c
+++ b/drivers/net/ethernet/microsoft/mana/shm_channel.c
@@ -6,6 +6,7 @@
 #include <linux/io.h>
 #include <linux/mm.h>

+#include <net/mana/gdma.h>
 #include <net/mana/shm_channel.h>

 #define PAGE_FRAME_L48_WIDTH_BYTES 6
@@ -183,7 +184,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool
reset_vf, u64 eq_addr,

 	/* EQ addr: low 48 bits of frame address */
 	shmem = (u64 *)ptr;
-	frame_addr = PHYS_PFN(eq_addr);
+	frame_addr = MANA_PFN(eq_addr);
 	*shmem = frame_addr & PAGE_FRAME_L48_MASK;
 	all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) <<
 		(frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS);
In mana_smc_setup_hwc() a few lines above this change, code using
PAGE_ALIGNED() is unchanged.  Is it correct that the eq/cq/rq/sq addresses
must be aligned to 64K if PAGE_SIZE is 64K?

Related, I wonder about how MANA_PFN() is defined. If PAGE_SIZE is 64K,
MANA_PFN() will first right-shift 16, then left shift 4. The net is right-shift 12,
corresponding to the 4K chunks that MANA expects. But that approach guarantees
that the rightmost 4 bits of the MANA PFN will always be zero. That's consistent
with requiring the addresses to be PAGE_ALIGNED() to 64K, but I'm unclear whether
that is really the requirement. You might compare with the definition of
HVPFN_DOWN(), which has a similar goal for Linux guests communicating with
Hyper-V.
quoted hunk ↗ jump to hunk
@@ -191,7 +192,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool
reset_vf, u64 eq_addr,

 	/* CQ addr: low 48 bits of frame address */
 	shmem = (u64 *)ptr;
-	frame_addr = PHYS_PFN(cq_addr);
+	frame_addr = MANA_PFN(cq_addr);
 	*shmem = frame_addr & PAGE_FRAME_L48_MASK;
 	all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) <<
 		(frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS);
@@ -199,7 +200,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool
reset_vf, u64 eq_addr,

 	/* RQ addr: low 48 bits of frame address */
 	shmem = (u64 *)ptr;
-	frame_addr = PHYS_PFN(rq_addr);
+	frame_addr = MANA_PFN(rq_addr);
 	*shmem = frame_addr & PAGE_FRAME_L48_MASK;
 	all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) <<
 		(frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS);
@@ -207,7 +208,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool
reset_vf, u64 eq_addr,

 	/* SQ addr: low 48 bits of frame address */
 	shmem = (u64 *)ptr;
-	frame_addr = PHYS_PFN(sq_addr);
+	frame_addr = MANA_PFN(sq_addr);
 	*shmem = frame_addr & PAGE_FRAME_L48_MASK;
 	all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) <<
 		(frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS);
diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
index 27684135bb4d..b392559c33e9 100644
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -224,7 +224,12 @@ struct gdma_dev {
 	struct auxiliary_device *adev;
 };

-#define MINIMUM_SUPPORTED_PAGE_SIZE PAGE_SIZE
+/* These are defined by HW */
+#define MANA_MIN_QSHIFT 12
+#define MANA_MIN_QSIZE (1 << MANA_MIN_QSHIFT)
+#define MANA_MIN_QALIGN(x) ALIGN((x), MANA_MIN_QSIZE)
+#define MANA_MIN_QALIGNED(addr) IS_ALIGNED((unsigned long)(addr), MANA_MIN_QSIZE)
+#define MANA_PFN(a) (PHYS_PFN(a) << (PAGE_SHIFT - MANA_MIN_QSHIFT))
See comments above about how this is defined.

Michael
quoted hunk ↗ jump to hunk
 #define GDMA_CQE_SIZE 64
 #define GDMA_EQE_SIZE 16
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index 561f6719fb4e..43e8fc574354 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -42,7 +42,8 @@ enum TRI_STATE {

 #define MAX_SEND_BUFFERS_PER_QUEUE 256

-#define EQ_SIZE (8 * PAGE_SIZE)
+#define EQ_SIZE (8 * MANA_MIN_QSIZE)
+
 #define LOG2_EQ_THROTTLE 3

 #define MAX_PORTS_IN_MANA_DEV 256
--
2.34.1
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help