Thread: 8 messages, 4 authors, 2021-06-14

Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request

From: Sagi Grimberg <sagi@grimberg.me>
Date: 2021-06-08 17:44:17
Subsystem: nvm express target driver, the rest · Maintainers: Christoph Hellwig, Sagi Grimberg, Chaitanya Kulkarni, Linus Torvalds

Hi Christoph, Sagi,

We're testing some device error recovery scenarios and hit the following BUG (full stack trace below).
In the error scenario, nvmet_rdma_queue_response receives an error from the device when trying to post a WR.
This leads to nvmet_rdma_release_rsp being called from softirq context, eventually
reaching blk_mq_delay_run_hw_queue, which tries to schedule while in softirq.

Could you please advise what the correct solution should be in this case?

Hey Michal,

I agree this can happen and requires correction. Does the below resolve
the issue?

--
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 7d607f435e36..6d2eea322779 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -16,6 +16,7 @@
  #include <linux/wait.h>
  #include <linux/inet.h>
  #include <asm/unaligned.h>
+#include <linux/async.h>

  #include <rdma/ib_verbs.h>
  #include <rdma/rdma_cm.h>
@@ -712,6 +713,12 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
         }
  }

+static void nvmet_rdma_async_release_rsp(void *data, async_cookie_t cookie)
+{
+       struct nvmet_rdma_rsp *rsp = data;
+       nvmet_rdma_release_rsp(rsp);
+}
+
  static void nvmet_rdma_queue_response(struct nvmet_req *req)
  {
         struct nvmet_rdma_rsp *rsp =
@@ -745,7 +752,12 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)

         if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
                 pr_err("sending cmd response failed\n");
-               nvmet_rdma_release_rsp(rsp);
+               /*
+                * We might be in atomic context, hence release
+                * the rsp in async context in case we need to
+                * process the wr_wait_list.
+                */
+               async_schedule(nvmet_rdma_async_release_rsp, rsp);
         }
  }
--
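
For anyone following along outside the kernel, the deferral idea in the
patch above can be sketched in plain user-space C. This is only an
illustration under stated assumptions: struct rsp, in_atomic_ctx,
defer_release() and run_deferred() are hypothetical stand-ins (for
struct nvmet_rdma_rsp, the softirq state, async_schedule() and the async
worker respectively), not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nvmet_rdma_rsp. */
struct rsp {
	bool released;
	struct rsp *next;
};

/* Simulated "we are in softirq" flag; the kernel tracks this itself. */
static bool in_atomic_ctx;

/* Pending releases, waiting for a context that is allowed to sleep. */
static struct rsp *deferred_head;

/* Stand-in for nvmet_rdma_release_rsp(): may sleep, so never call it atomically. */
static void release_rsp(struct rsp *r)
{
	assert(!in_atomic_ctx);
	r->released = true;
}

/* Stand-in for async_schedule(): only queues the work, never runs it inline. */
static void defer_release(struct rsp *r)
{
	r->next = deferred_head;
	deferred_head = r;
}

/* Error path of the send: safe to call with in_atomic_ctx set. */
static void queue_response_failed(struct rsp *r)
{
	defer_release(r);
}

/* Stand-in for the async worker: drains the list, returns how many were released. */
static int run_deferred(void)
{
	int n = 0;

	while (deferred_head) {
		struct rsp *r = deferred_head;

		deferred_head = r->next;
		r->next = NULL;
		release_rsp(r);
		n++;
	}
	return n;
}
```

The point of the split is that the failing path only queues the rsp; the
release itself (and anything it triggers, such as resubmitting commands
from the wr_wait_list) runs later from a context that may sleep.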
thanks,
Michal

[ 8790.082863] nvmet_rdma: post_recv cmd failed
[ 8790.083484] nvmet_rdma: sending cmd response failed
[ 8790.084131] ------------[ cut here ]------------
[ 8790.084140] WARNING: CPU: 7 PID: 46 at block/blk-mq.c:1422 __blk_mq_run_hw_queue+0xb7/0x100
[ 8790.084619] Modules linked in: null_blk nvmet_rdma nvmet nvme_rdma nvme_fabrics nvme_core netconsole qedr(OE) qede(OE) qed(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nft_counter nft_compat tun bridge stp llc nf_tables nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_umad rpcrdma rdma_ucm ib_iser rdma_cm iw_cm intel_rapl_msr intel_rapl_common ib_cm sb_edac libiscsi scsi_transport_iscsi kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sunrpc rapl ib_uverbs ib_core cirrus drm_kms_helper drm virtio_balloon i2c_piix4 pcspkr crc32c_intel virtio_net serio_raw net_failover failover floppy crc8 ata_generic pata_acpi qemu_fw_cfg [last unloaded: qedr]
[ 8790.084748] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G           OE     5.8.10 #1
[ 8790.084749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[ 8790.084752] RIP: 0010:__blk_mq_run_hw_queue+0xb7/0x100
[ 8790.084753] Code: 00 48 89 ef e8 ea 34 c8 ff 48 89 df 41 89 c4 e8 1f 7f 00 00 f6 83 a8 00 00 00 20 74 b1 41 f7 c4 fe ff ff ff 74 b7 0f 0b eb b3 <0f> 0b eb 86 48 83 bf 98 00 00 00 00 48 c7 c0 df 81 3f 82 48 c7 c2
[ 8790.084754] RSP: 0018:ffffc9000020ba60 EFLAGS: 00010206
[ 8790.084755] RAX: 0000000000000100 RBX: ffff88809fe8c400 RCX: 00000000ffffffff
[ 8790.084756] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88809fe8c400
[ 8790.084756] RBP: ffff888137b81a50 R08: ffffffffffffffff R09: 0000000000000020
[ 8790.084757] R10: 0000000000000001 R11: ffff8881365d4968 R12: 0000000000000000
[ 8790.084758] R13: ffff888137b81a40 R14: ffff88811e2b9e80 R15: ffff8880b3d964f0
[ 8790.084759] FS:  0000000000000000(0000) GS:ffff88813bbc0000(0000) knlGS:0000000000000000
[ 8790.084759] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8790.084760] CR2: 000055ca53900da8 CR3: 000000012b83e006 CR4: 0000000000360ee0
[ 8790.084763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8790.084763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8790.084764] Call Trace:
[ 8790.084767]  __blk_mq_delay_run_hw_queue+0x140/0x160
[ 8790.084768]  blk_mq_get_tag+0x1d1/0x270
[ 8790.084771]  ? finish_wait+0x80/0x80
[ 8790.084773]  __blk_mq_alloc_request+0xb1/0x100
[ 8790.084774]  blk_mq_make_request+0x144/0x5d0
[ 8790.084778]  generic_make_request+0x2db/0x340
[ 8790.084779]  ? bvec_alloc+0x82/0xe0
[ 8790.084781]  submit_bio+0x43/0x160
[ 8790.084781]  ? bio_add_page+0x39/0x90
[ 8790.084794]  nvmet_bdev_execute_rw+0x28c/0x360 [nvmet]
[ 8790.084800]  nvmet_rdma_execute_command+0x72/0x110 [nvmet_rdma]
[ 8790.084802]  nvmet_rdma_release_rsp+0xc1/0x1e0 [nvmet_rdma]
[ 8790.084804]  nvmet_rdma_queue_response.cold.63+0x14/0x19 [nvmet_rdma]
[ 8790.084806]  nvmet_req_complete+0x11/0x40 [nvmet]
[ 8790.084809]  nvmet_bio_done+0x27/0x100 [nvmet]
[ 8790.084811]  blk_update_request+0x23e/0x3b0
[ 8790.084812]  blk_mq_end_request+0x1a/0x120
[ 8790.084814]  blk_done_softirq+0xa1/0xd0
[ 8790.084818]  __do_softirq+0xe4/0x2f8
[ 8790.084821]  ? sort_range+0x20/0x20
[ 8790.084824]  run_ksoftirqd+0x26/0x40
[ 8790.084825]  smpboot_thread_fn+0xc5/0x160
[ 8790.084827]  kthread+0x116/0x130
[ 8790.084828]  ? kthread_park+0x80/0x80
[ 8790.084832]  ret_from_fork+0x22/0x30
[ 8790.084833] ---[ end trace 16ec813ee3f82b56 ]---
[ 8790.085314] BUG: scheduling while atomic: ksoftirqd/7/46/0x00000100
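
The trailing 0x00000100 in the BUG line is the preempt count at the time
of the offending schedule. A minimal sketch of decoding it, assuming the
bit layout from include/linux/preempt.h in kernels of this vintage (8
preempt bits, then 8 softirq bits, then 4 hardirq bits);
decode_preempt_count(), serving_softirq() and struct preempt_fields are
hypothetical helper names used only for this illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field widths assumed from include/linux/preempt.h (around v5.8). */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4

#define PREEMPT_SHIFT	0
#define SOFTIRQ_SHIFT	(PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)

#define SOFTIRQ_OFFSET	(1u << SOFTIRQ_SHIFT)

static_assert(SOFTIRQ_OFFSET == 0x100, "softirq field starts at bit 8");

/* Hypothetical container for the decoded fields. */
struct preempt_fields {
	unsigned int preempt;	/* preempt_disable() nesting depth */
	unsigned int softirq;	/* softirq field */
	unsigned int hardirq;	/* hardirq nesting depth */
};

/* Split a raw preempt count into its three low fields. */
static struct preempt_fields decode_preempt_count(uint32_t pc)
{
	struct preempt_fields f;

	f.preempt = (pc >> PREEMPT_SHIFT) & ((1u << PREEMPT_BITS) - 1);
	f.softirq = (pc >> SOFTIRQ_SHIFT) & ((1u << SOFTIRQ_BITS) - 1);
	f.hardirq = (pc >> HARDIRQ_SHIFT) & ((1u << HARDIRQ_BITS) - 1);
	return f;
}

/* True when the lowest softirq bit is set, i.e. a softirq handler is running. */
static bool serving_softirq(uint32_t pc)
{
	return (pc & SOFTIRQ_OFFSET) != 0;
}
```

For 0x00000100 this gives preempt = 0, softirq = 1, hardirq = 0, i.e. the
task was serving a softirq, which matches the ksoftirqd/7 frames in the
trace.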
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme