RE: [bug report] Hang on sync after dd
From: Kashyap Desai <kashyap.desai@broadcom.com>
Date: 2020-12-01 10:28:08
Also in:
linux-scsi
@Kashyap, have you guys tested megaraid sas much for this?
John - I tested the V4 version of "scsi: core: Only re-run queue in scsi_end_request() if device queue is busy" on an MR controller, using different reduced device queue depths (1 to 16). I can try the exact same test case with the MR controller.
Thanks,
John

Block debugfs info is as follows:

estuary:/sys/kernel/debug/block/sda/hctx8$ cat
active   cpu101/  cpu96/  cpu99/    dispatch_busy  io_poll  sched_tags         tags
busy     cpu102/  cpu97/  ctx_map   dispatched     queued   sched_tags_bitmap  tags_bitmap
cpu100/  cpu103/  cpu98/  dispatch  flags          run      state              type
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu
cpu100/ cpu101/ cpu102/ cpu103/ cpu96/ cpu97/ cpu98/ cpu99/
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu
cpu100/ cpu101/ cpu102/ cpu103/ cpu96/ cpu97/ cpu98/ cpu99/
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu96/
completed default_rq_list dispatched merged poll_rq_list read_rq_list
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu96/dispatched
0 0
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu97/dispatched
0 0
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu98/dispatched
0 0
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu99/dispatched
0 0
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu100/dispatched
3 0
estuary:/sys/kernel/debug/block/sda/hctx8$ cat cpu100/completed
2 0
estuary:/sys/kernel/debug/block/sda/hctx8$
estuary:/sys/kernel/debug/block/sda/hctx8$
estuary:/sys/kernel/debug/block/sda/hctx8$ cat state
SCHED_RESTART
When I tested V3 of "scsi: core: Only re-run queue in scsi_end_request() if device queue is busy", I noticed a similar hang, and that was fixed in V4 (the final patch). Let me try on the MR controller one more time. The hctx state SCHED_RESTART indicates that someone should have kicked off the h/w queue, but that was missed. It is possible that when you revert "scsi: core: Only re-run queue in scsi_end_request() if device queue is busy", the actual race window narrows, so this may be an existing hidden issue.
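To make the check above repeatable, here is a minimal Python sketch that walks the hctx directories of one block device in debugfs and flags any hctx whose state contains SCHED_RESTART while requests are still sitting on its dispatch list or a per-CPU "completed" counter lags "dispatched". The file names (state, dispatch, cpuN/dispatched, cpuN/completed) are taken from the output quoted in this thread; the function names are made up for illustration, and the debugfs layout may differ on other kernel versions.

```python
from pathlib import Path


def read_text(path):
    """Return stripped file contents, or "" if the file is missing or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return ""


def read_counters(path):
    """Parse a counter file such as "3 0" into a list of integers."""
    return [int(tok) for tok in read_text(path).split() if tok.isdigit()]


def scan_queue(dev_dir):
    """Report hctxs that look stalled under dev_dir.

    dev_dir is assumed to look like /sys/kernel/debug/block/sda, as in
    the session quoted in this thread (an assumption, not a stable ABI).
    """
    findings = []
    for hctx in sorted(Path(dev_dir).glob("hctx*")):
        state = read_text(hctx / "state")
        # One non-empty line in "dispatch" is treated as one pending request.
        dump = read_text(hctx / "dispatch").splitlines()
        pending = sum(1 for line in dump if line.strip())
        lagging = {}
        for ctx in sorted(hctx.glob("cpu*")):
            dispatched = sum(read_counters(ctx / "dispatched"))
            completed = sum(read_counters(ctx / "completed"))
            if completed < dispatched:
                lagging[ctx.name] = (dispatched, completed)
        if "SCHED_RESTART" in state and (pending or lagging):
            findings.append(
                {"hctx": hctx.name, "state": state,
                 "pending": pending, "lagging": lagging}
            )
    return findings
```

Calling scan_queue("/sys/kernel/debug/block/sda") as root (with debugfs mounted) returns one dictionary per suspicious hctx. A snapshot taken while I/O is still in flight can show the same signature briefly, so it is only meaningful once the hang has been observed.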
estuary:/sys/kernel/debug/block/sda/hctx8$ ls
active  cpu101  cpu96  cpu99     dispatch_busy  io_poll  sched_tags         tags
busy    cpu102  cpu97  ctx_map   dispatched     queued   sched_tags_bitmap  tags_bitmap
cpu100  cpu103  cpu98  dispatch  flags          run      state              type
estuary:/sys/kernel/debug/block/sda/hctx8$ cat dispatch
000000007abb596e {.op=FLUSH, .cmd_flags=PREFLUSH, .rq_flags=FLUSH_SEQ|MQ_INFLIGHT|DONTPREP, .state=idle, .tag=21, .internal_tag=-1, .cmd=opcode=0x35 35 00 00 00 00 00 00 00 00 00, .retries=0, .result = 0x0, .flags=TAGGED|INITIALIZED|3, .timeout=60.000, allocated 2208.876 s ago}
estuary:/sys/kernel/debug/block/sda/hctx8$

On cpu100, it seems completed is less than number dispatched.

If this issue is reproducible, can you check the pending commands? Is there any pattern in the pending commands?
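One way to look for a pattern in the pending commands is to parse the request dumps from the dispatch files and tally them by operation, state and SCSI opcode. The sketch below assumes the dump format shown in this thread (".name=value" pairs inside braces); the helper names are hypothetical and the format may change between kernel versions. For reference, opcode 0x35 is SYNCHRONIZE CACHE (10).

```python
import re
from collections import Counter

# Matches ".name=value" or ".name = value" pairs inside the request dump.
FIELD_RE = re.compile(r"\.(\w+)\s*=\s*([^,}]*)")
# Pulls the SCSI opcode out of the ".cmd=opcode=0x.. <bytes>" field.
OPCODE_RE = re.compile(r"opcode=(0x[0-9a-fA-F]+)")


def parse_rq(line):
    """Turn one debugfs request dump line into a dict of string fields."""
    fields = {name: value.strip() for name, value in FIELD_RE.findall(line)}
    match = OPCODE_RE.search(fields.get("cmd", ""))
    fields["opcode"] = match.group(1) if match else None
    return fields


def summarize(lines):
    """Count pending requests grouped by (op, state, opcode)."""
    counts = Counter()
    for line in lines:
        if "{" not in line:
            continue  # skip prompts, blank lines and other noise
        rq = parse_rq(line)
        counts[(rq.get("op"), rq.get("state"), rq["opcode"])] += 1
    return counts
```

Feeding the lines of every hctx's dispatch file into summarize() gives a quick view of whether the stuck requests are all of one kind (for example, only flushes) or a mix.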