Re: DIF/DIX issue related to config CONFIG_SCSI_MQ_DEFAULT
From: chenxiang (M) <hidden>
Date: 2018-11-28 03:37:37
Hi Lei Ming,

On 2018/11/27 21:08, Ming Lei wrote:
> On Tue, Nov 27, 2018 at 05:55:45PM +0800, chenxiang (M) wrote:
>> Hi all,
>>
>> There is an issue which may be related to CONFIG_SCSI_MQ_DEFAULT: previously we developed the DIF/DIX feature on kernel 4.18 (where CONFIG_SCSI_MQ_DEFAULT is disabled by default), and it worked well.
>
> I guess you are testing hisi_sas_v3_hw. Does 4.18 work with 'scsi_mod.use_blk_mq=Y'? If yes, you may run 'git bisect' to find the first bad commit.
>> But when we switch to kernel 4.19-rc1 and 4.20-rc1, the call trace below occurs when running fio. If we disable CONFIG_SCSI_MQ_DEFAULT, it works well. It also seems to work well if we switch from ioengine=libaio to ioengine=psync. Do you have any idea, or have you encountered a similar issue?
>
> I tested scsi-debug with 'dix=1 dif=1' and everything looks fine. Are you using direct I/O or not?
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
>> fio 2.0.5
>> Starting 12 processes
>> [ 629.210506] Unable to handle kernel paging request at virtual address 0000ffff8027e048
>> [ 629.226373] Mem abort info:
>> [ 629.231952] ESR = 0x96000006
>> [ 629.238052] Exception class = DABT (current EL), IL = 32 bits
>> [ 629.249898] SET = 0, FnV = 0
>> [ 629.255998] EA = 0, S1PTW = 0
>> [ 629.262272] Data abort info:
>> [ 629.268023] ISV = 0, ISS = 0x00000006
>> [ 629.275690] CM = 0, WnR = 0
>> [ 629.281617] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000085c91728
>> [ 629.294857] [0000ffff8027e048] pgd=00000027a8644003, pud=00000027a85ea003, pmd=0000000000000000
>> [ 629.312278] Internal error: Oops: 96000006 [#1] PREEMPT SMP
>> [ 629.323427] Modules linked in: hisi_sas_v3_hw [last unloaded: hisi_sas_v3_hw]
>> [ 629.337713] CPU: 13 PID: 4465 Comm: fio Not tainted 4.20.0-rc1-15093-ge876dec #1067
>> [ 629.353040] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - B601 (V6.01) 11/08/2018
>> [ 629.370633] pstate: 80400009 (Nzcv daif +PAN -UAO)
>> [ 629.380218] pc : deadline_remove_request+0x2c/0xd0
>
> Could you use gdb to find where 'deadline_remove_request+0x2c' points to?

From objdump, 'deadline_remove_request+0x2c' falls within __list_del -> INIT_LIST_HEAD.
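
For readers who do not have the list helpers in their head, here is a rough Python model of what an unlink-and-reinitialise step such as the kernel's list_del_init() does. It is only an illustrative sketch, not the kernel source: the class ListHead and the snake_case function names are stand-ins invented for this example. It shows that the unlink step reads and writes through the entry's neighbour pointers before the entry is re-initialised, so a stale neighbour pointer would be touched at this point. The thread itself does not establish that this is the root cause.

```python
class ListHead:
    """Stand-in for struct list_head: a node in a circular doubly linked list."""

    def __init__(self):
        # An initialised head points at itself in both directions.
        self.next = self
        self.prev = self


def init_list_head(entry):
    """Model of INIT_LIST_HEAD(): make the entry an empty, self-linked list."""
    entry.next = entry
    entry.prev = entry


def list_add_tail(entry, head):
    """Insert entry just before head, i.e. at the tail of the list."""
    prev = head.prev
    entry.next = head
    entry.prev = prev
    prev.next = entry
    head.prev = entry


def list_del_init(entry):
    """Unlink entry from its list, then re-initialise it."""
    prev, nxt = entry.prev, entry.next
    # The unlink goes through both neighbours; in C these are the
    # pointer dereferences that fault if a neighbour pointer is bad.
    nxt.prev = prev
    prev.next = nxt
    init_list_head(entry)


def list_empty(head):
    return head.next is head


if __name__ == "__main__":
    head, a, b = ListHead(), ListHead(), ListHead()
    list_add_tail(a, head)
    list_add_tail(b, head)
    list_del_init(a)
    print(head.next is b, a.next is a, list_empty(head))
```

Because the removed entry ends up self-linked, removing it a second time is harmless in this model, which is the usual reason for preferring the "del_init" form over a plain delete.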
> Thanks,
> Ming
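
As a reading aid for the oops quoted in this message, the ESR value 0x96000006 can be split into the fields the kernel prints. The sketch below assumes the AArch64 ESR_ELx layout for data aborts as I understand it (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], with WnR in bit 6 and the fault status code in bits [5:0]); the function names decode_esr and describe_dfsc are hypothetical helpers, not part of any kernel tool.

```python
def decode_esr(esr):
    """Split a 32-bit AArch64 ESR_ELx value into data-abort fields."""
    iss = esr & 0x1FFFFFF               # instruction specific syndrome, bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,       # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,        # 1 means a 32-bit instruction
        "iss": iss,
        "isv": (iss >> 24) & 0x1,
        "set": (iss >> 11) & 0x3,
        "fnv": (iss >> 10) & 0x1,
        "ea": (iss >> 9) & 0x1,
        "cm": (iss >> 8) & 0x1,
        "s1ptw": (iss >> 7) & 0x1,
        "wnr": (iss >> 6) & 0x1,        # 0 = read access, 1 = write access
        "dfsc": iss & 0x3F,             # data fault status code, bits [5:0]
    }


def describe_dfsc(dfsc):
    """Name the fault status code; only translation faults are covered here."""
    if 0x04 <= dfsc <= 0x07:
        return "translation fault, level %d" % (dfsc - 0x04)
    return "other (0x%02x)" % dfsc


if __name__ == "__main__":
    fields = decode_esr(0x96000006)
    print("EC = 0x%02x, IL = %d, ISS = 0x%08x" % (fields["ec"], fields["il"], fields["iss"]))
    print("WnR = %d, DFSC = %s" % (fields["wnr"], describe_dfsc(fields["dfsc"])))
```

Under these assumptions the value decodes to exception class 0x25 (a data abort taken at the current exception level), a read access (WnR = 0), and a level 2 translation fault, which agrees with the pmd=0000000000000000 entry in the page table walk printed by the kernel.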