Re: [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()

From: Nick Child <nnac123@linux.ibm.com>
Date: 2025-01-08 14:09:16
Also in: linuxppc-dev, lkml

On Tue, Jan 07, 2025 at 03:04:40PM -0800, Yury Norov wrote:

On Tue, Jan 07, 2025 at 02:43:01PM -0800, Yury Norov wrote:

On Tue, Jan 07, 2025 at 04:37:17PM -0600, Nick Child wrote:

On Sat, Dec 28, 2024 at 10:49:35AM -0800, Yury Norov wrote:

A loop based on cpumask_next_wrap() open-codes the dedicated macro
for_each_online_cpu_wrap(). Using the macro avoids setting bits in the
affinity mask more than once when stride >= num_online_cpus.

This also helps to drop cpumask handling code in the caller function.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index e95ae0d39948..4cfd90fb206b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c

@@ -234,11 +234,16 @@ static int ibmvnic_set_queue_affinity(struct ibmvnic_sub_crq_queue *queue,
 		(*stragglers)--;
 	}
 	/* atomic write is safer than writing bit by bit directly */
-	for (i = 0; i < stride; i++) {
-		cpumask_set_cpu(*cpu, mask);
-		*cpu = cpumask_next_wrap(*cpu, cpu_online_mask,
-					 nr_cpu_ids, false);
+	for_each_online_cpu_wrap(i, *cpu) {
+		if (!stride--)
+			break;
+		cpumask_set_cpu(i, mask);
 	}
+
+	/* For the next queue we start from the first unused CPU in this queue */
+	if (i < nr_cpu_ids)
+		*cpu = i + 1;
+

This should read '*cpu = i', since the loop breaks after incrementing i.
Thanks!

cpumask_next_wrap() does the '+ 1' for you. The for_each_cpu_wrap() starts
exactly where you point. So, this '+ 1' needs to be explicit now.

Does that make sense?

Ah, I think I see what you mean. It should be like this, right?

  for_each_online_cpu_wrap(i, *cpu) {
  	if (!stride--) {
  		*cpu = i + 1;
  		break;
  	}
  	cpumask_set_cpu(i, mask);
  }

Not quite. for_each_online_cpu_wrap will advance i to the next online
cpu and then enter the body of the loop. When we break (because stride
is zero), we exit the loop before i is added to any mask, so i is the
next unassigned online cpu.
I tested this to make sure; we see unused cpus (#7, #23) with the patch as is:
  IRQ : 256 -> ibmvnic-30000003-tx0
	/proc/irq/256/smp_affinity_list:0-6
  IRQ : 257 -> ibmvnic-30000003-tx1
	/proc/irq/257/smp_affinity_list:16-22
  IRQ : 258 -> ibmvnic-30000003-rx0
	/proc/irq/258/smp_affinity_list:8-14
  IRQ : 259 -> ibmvnic-30000003-rx1
	/proc/irq/259/smp_affinity_list:24-30
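
To illustrate the off-by-one outside the kernel, here is a rough userspace
sketch of the loop. It is not the driver code: NR_CPUS = 32 with every CPU
online, a stride of 7, and the helpers assign_queue() and simulate() are all
assumptions picked to mirror the listing above, and the plain modulo loop only
stands in for for_each_online_cpu_wrap(). The 'bump' parameter is hypothetical
too: 1 models '*cpu = i + 1' as posted, 0 models '*cpu = i'.

```c
#include <assert.h>

#define NR_CPUS 32	/* assumption: 32 CPUs, all of them online */

/*
 * Userspace stand-in for the patched loop. Sets up to 'stride' entries of
 * used[] starting at *cpu, then advances *cpu to 'i + bump' on early exit.
 * bump = 1 models the patch as posted, bump = 0 models '*cpu = i'.
 */
static void assign_queue(int *cpu, int stride, int bump, int used[NR_CPUS])
{
	int i = NR_CPUS;
	int n;

	assert(stride > 0);

	/* Models for_each_online_cpu_wrap(i, *cpu) with every CPU online. */
	for (n = 0; n < NR_CPUS; n++) {
		i = (*cpu + n) % NR_CPUS;
		if (!stride--)
			break;		/* i was advanced but never added */
		used[i] = 1;
	}
	if (n == NR_CPUS)
		i = NR_CPUS;		/* full pass: model the iterator as out of range */

	if (i < NR_CPUS)
		*cpu = i + bump;
}

/* Hypothetical driver: run 'queues' assignments back to back from CPU 0. */
static void simulate(int queues, int stride, int bump, int used[NR_CPUS])
{
	int cpu = 0;
	int q;

	for (q = 0; q < NR_CPUS; q++)
		used[q] = 0;
	for (q = 0; q < queues; q++)
		assign_queue(&cpu, stride, bump, used);
}
```

Under these assumptions, simulate(4, 7, 1, used) marks 0-6, 8-14, 16-22 and
24-30 and leaves a one-CPU hole after each queue, while simulate(4, 7, 0, used)
marks 0-27 contiguously.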
