Re: [PATCH] scsi: scsi_host_queue_ready: increase busy count early

From: Roger Willcocks <hidden>
Date: 2021-02-22 14:37:02

FYI, we have exactly this issue on a machine here running CentOS 8.3 (kernel 4.18.0-240.1.1), so presumably this happens in RHEL 8 too.

Controller is an MSCC / Adaptec 3154-8i16e driving 60 x 12TB HGST drives configured as five twelve-drive RAID-6 arrays, software-striped using md and formatted with XFS.

Test software writes to the array using multiple threads in parallel.

The smartpqi driver would report the controller offline within ten minutes or so, with status code 0x6100c.

Changed the driver to set 'nr_hw_queues = 1' and then tested by filling the array with random files (which took a couple of days). That completed fine, so it looks like that one-line change fixes it.

Would, of course, be helpful if this was back-ported.

—
Roger

On 3 Feb 2021, at 15:56, Don.Brace@microchip.com wrote:

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH] scsi: scsi_host_queue_ready: increase busy count early

Confirmed my suspicions - it looks like the host is sent more commands 
than it can handle. We would need many disks to see this issue though, 
which you have.

So for stable kernels, 6eb045e092ef is not in 5.4 . Next is 5.10, and 
I suppose it could be simply fixed by setting .host_tagset in scsi 
host template there.

Thanks,
John
--
Don: Even though this works for current kernels, what would be the chances of 
this getting back-ported to 5.9 or even further?

Otherwise the original patch smartpqi_fix_host_qdepth_limit would 
correct this issue for older kernels.

True. However this is 5.12 material, so we shouldn't be bothered by that here. For 5.5 up to 5.9, you need a workaround. But I'm unsure whether smartpqi_fix_host_qdepth_limit would be the solution.
You could simply divide can_queue by nr_hw_queues, as suggested before, or even simpler, set nr_hw_queues = 1.
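
The two workarounds above, and host_tagset, can be compared with a rough model of how many commands the block layer could have outstanding against the host at once. This is only a sketch under stated assumptions, not kernel code: it assumes that without host_tagset each hardware queue gets its own tag set sized to can_queue, and that nothing else caps the host-wide total. The enum, the function name and the example numbers are hypothetical.

```c
#include <assert.h>

/* Hypothetical model only; these names do not exist in the kernel. */
enum tag_scheme {
    PER_QUEUE_TAGS, /* every hardware queue has its own can_queue tags */
    DIVIDED_TAGS,   /* can_queue divided by nr_hw_queues up front */
    SINGLE_QUEUE,   /* nr_hw_queues forced to 1 */
    HOST_TAGSET     /* one set of can_queue tags shared by all queues */
};

/*
 * Worst-case number of commands in flight on the host, assuming tags
 * are the only limit (no host-wide busy counter).
 */
static unsigned int worst_case_outstanding(enum tag_scheme scheme,
                                           unsigned int can_queue,
                                           unsigned int nr_hw_queues)
{
    assert(nr_hw_queues > 0);

    switch (scheme) {
    case PER_QUEUE_TAGS:
        /* Each queue can fill up independently: host is oversubscribed. */
        return can_queue * nr_hw_queues;
    case DIVIDED_TAGS:
        /* Integer division may leave a few tags unused. */
        return (can_queue / nr_hw_queues) * nr_hw_queues;
    case SINGLE_QUEUE:
    case HOST_TAGSET:
        /* A single pool of can_queue tags bounds the whole host. */
        return can_queue;
    }
    return 0;
}
```

With made-up values can_queue = 1000 and nr_hw_queues = 8, the per-queue case allows up to 8000 outstanding commands, while the other three stay at or below 1000. The model says nothing about throughput; it only shows that each option keeps the total within can_queue.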

How much performance would that cost you?

Don: For my HBA disk tests...

Dividing can_queue by nr_hw_queues: about a 40% drop, ~380K - 400K IOPS
Setting nr_hw_queues = 1: about a 1.5x gain, ~980K IOPS
Setting host_tagset = 1: ~640K IOPS

So, it seems that setting nr_hw_queues = 1 results in the best performance.
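
The relative figures above are consistent with each other if the ~640K IOPS host_tagset result is taken as the baseline. That baseline is an assumption; the message does not say what the "drop" and "gain" are measured against. A small sketch of the arithmetic, with a hypothetical helper name:

```c
#include <assert.h>

/*
 * Integer percentage of `value` relative to `baseline`, rounded down.
 * Inputs here are in thousands of IOPS, as quoted in the message.
 */
static unsigned int percent_of(unsigned int value, unsigned int baseline)
{
    assert(baseline > 0);
    return (value * 100u) / baseline;
}
```

Against a 640K baseline, 980K comes out at 153% (roughly the quoted 1.5x gain), and 380K to 400K comes out at 59% to 62% (roughly the quoted 40% drop).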

Is this expected? Would this also be true for the future?

Thanks,
Don Brace

Below is my setup.
---
[3:0:0:0]    disk    HP       EG0900FBLSK      HPD7  /dev/sdd 
[3:0:1:0]    disk    HP       EG0900FBLSK      HPD7  /dev/sde 
[3:0:2:0]    disk    HP       EG0900FBLSK      HPD7  /dev/sdf 
[3:0:3:0]    disk    HP       EH0300FBQDD      HPD5  /dev/sdg 
[3:0:4:0]    disk    HP       EG0900FDJYR      HPD4  /dev/sdh 
[3:0:5:0]    disk    HP       EG0300FCVBF      HPD9  /dev/sdi 
[3:0:6:0]    disk    HP       EG0900FBLSK      HPD7  /dev/sdj 
[3:0:7:0]    disk    HP       EG0900FBLSK      HPD7  /dev/sdk 
[3:0:8:0]    disk    HP       EG0900FBLSK      HPD7  /dev/sdl 
[3:0:9:0]    disk    HP       MO0200FBRWB      HPD9  /dev/sdm 
[3:0:10:0]   disk    HP       MM0500FBFVQ      HPD8  /dev/sdn 
[3:0:11:0]   disk    ATA      MM0500GBKAK      HPGC  /dev/sdo 
[3:0:12:0]   disk    HP       EG0900FBVFQ      HPDC  /dev/sdp 
[3:0:13:0]   disk    HP       VO006400JWZJT    HP00  /dev/sdq 
[3:0:14:0]   disk    HP       VO015360JWZJN    HP00  /dev/sdr 
[3:0:15:0]   enclosu HP       D3700            5.04  -        
[3:0:16:0]   enclosu HP       D3700            5.04  -        
[3:0:17:0]   enclosu HPE      Smart Adapter    3.00  -        
[3:1:0:0]    disk    HPE      LOGICAL VOLUME   3.00  /dev/sds 
[3:2:0:0]    storage HPE      P408e-p SR Gen10 3.00  -        
-----
[global]
ioengine=libaio
; rw=randwrite
; percentage_random=40
rw=write
size=100g
bs=4k
direct=1
ramp_time=15
; filename=/mnt/fio_test
; cpus_allowed=0-27
iodepth=4096

[/dev/sdd]
[/dev/sde]
[/dev/sdf]
[/dev/sdg]
[/dev/sdh]
[/dev/sdi]
[/dev/sdj]
[/dev/sdk]
[/dev/sdl]
[/dev/sdm]
[/dev/sdn]
[/dev/sdo]
[/dev/sdp]
[/dev/sdq]
[/dev/sdr]


Distribution kernels would be yet another issue; distros can backport host_tagset and get rid of the issue.

Regards
Martin
