Thread: 4 messages, 3 authors, 2019-08-22

RE: [Patch v2] storvsc: setup 1:1 mapping between hardware queue and CPU queue

From: Long Li <longli@microsoft.com>
Date: 2019-08-22 22:29:23
Also in: linux-scsi, lkml


From: Long Li <redacted> Sent: Thursday, August 22, 2019 1:42 PM
storvsc doesn't use a dedicated hardware queue for a given CPU queue.
When issuing I/O, it selects the returning CPU (hardware queue)
dynamically based on vmbus channel usage across all channels.

This patch advertises num_possible_cpus() as the number of hardware
queues. This makes the upper layer set up a 1:1 mapping between
hardware queue and CPU queue and avoids unnecessary locking when
issuing I/O.
Changes:
v2: rely on default upper layer function to map queues (suggested by
Ming Lei [off-list ref])

Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/scsi/storvsc_drv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index b89269120a2d..dfd3b76a4f89 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1836,8 +1836,7 @@ static int storvsc_probe(struct hv_device *device,
 	/*
 	 * Set the number of HW queues we are supporting.
 	 */
-	if (stor_device->num_sc != 0)
-		host->nr_hw_queues = stor_device->num_sc + 1;
+	host->nr_hw_queues = num_possible_cpus();
For a lot of the VM sizes in Azure, num_possible_cpus() is 128, even
if the VM has only 4 or 8 or some other smaller number of vCPUs.
So I'm wondering if you really want num_present_cpus() here instead,
which would include only the vCPUs that actually exist in the VM.
I think reporting num_possible_cpus() does no extra harm and takes no more
resources, because the block layer allocates the map for all possible CPUs.

The actual mapping is done in blk_mq_map_queues(), which iterates over all
the possible CPUs. If we report num_present_cpus(), the rest of the CPUs
still need to be mapped.
Actually, I get your point: reporting num_present_cpus() means fewer struct blk_mq_hw_ctx instances are created, so it saves memory.

If we don't plan to support adding/onlining CPUs, we should use num_present_cpus().
Michael
 	/*
 	 * Set the error handler work queue.
--
2.17.1
  