Thread (33 messages) 33 messages, 4 authors, 2021-10-13

Re: [PATCH 6/9] nvme: add support for batched completion of polled IO

From: Jens Axboe <axboe@kernel.dk>
Date: 2021-10-13 15:10:05

On 10/13/21 1:08 AM, Christoph Hellwig wrote:
On Tue, Oct 12, 2021 at 12:17:39PM -0600, Jens Axboe wrote:
quoted
Take advantage of struct io_batch, if passed in to the nvme poll handler.
If it's set, rather than complete each request individually inline, store
them in the io_batch list. We only do so for requests that will complete
successfully, anything else will be completed inline as before.

Add an mq_ops->complete_batch() handler to do the post-processing of
the io_batch list once polling is complete.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 69 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 63 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4ad63bb9f415..4713da708cd4 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -959,7 +959,7 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-static void nvme_pci_complete_rq(struct request *req)
+static void nvme_pci_unmap_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	struct nvme_dev *dev = iod->nvmeq->dev;
@@ -969,9 +969,34 @@ static void nvme_pci_complete_rq(struct request *req)
 			       rq_integrity_vec(req)->bv_len, rq_data_dir(req));
 	if (blk_rq_nr_phys_segments(req))
 		nvme_unmap_data(dev, req);
+}
+
+static void nvme_pci_complete_rq(struct request *req)
+{
+	nvme_pci_unmap_rq(req);
 	nvme_complete_rq(req);
 }
 
+static void nvme_pci_complete_batch(struct io_batch *ib)
+{
+	struct request *req;
+
+	req = ib->req_list;
+	while (req) {
+		nvme_pci_unmap_rq(req);
+		if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
+			nvme_cleanup_cmd(req);
+		if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
+				req_op(req) == REQ_OP_ZONE_APPEND)
+			req->__sector = nvme_lba_to_sect(req->q->queuedata,
+					le64_to_cpu(nvme_req(req)->result.u64));
+		req->status = nvme_error_status(nvme_req(req)->status);
+		req = req->rq_next;
+	}
+
+	blk_mq_end_request_batch(ib);
I hate all this open coding.  All the common logic needs to be in a
common helper.
I'll see if I can unify this a bit.
Also please add a for_each macro for the request list
iteration.  I already thought about that for the batch allocation in
for-next, but with ever more callers this becomes essential.
Added a prep patch with list helpers, current version is using those
now.
quoted
+	if (!nvme_try_complete_req(req, cqe->status, cqe->result)) {
+		enum nvme_disposition ret;
+
+		ret = nvme_decide_disposition(req);
+		if (unlikely(!ib || req->end_io || ret != COMPLETE)) {
+			nvme_pci_complete_rq(req);
This actually is the likely case as only polling ever does the batch
completion.  In doubt I'd prefer if we can avoid these likely/unlikely
annotations as much as possible.
If you look at the end of the series, IRQ is wired up for it too. But I
do agree with the unlikely, I generally dislike them too. I'll just kill
this one.
quoted
+		} else {
+			req->rq_next = ib->req_list;
+			ib->req_list = req;
+		}
And all this list manipulation should use proper helper.
Done
quoted
+	}
Also - can you look into turning this logic into an inline function with
a callback for the driver?  I think in general gcc will avoid the
indirect call for function pointers passed directly.  That way we can
keep a nice code structure but also avoid the indirections.

Same for nvme_pci_complete_batch.
Not sure I follow. It's hard to do a generic callback for this, as the
batch can live outside the block layer through the plug. That's why
it's passed the way it is in terms of completion hooks.

-- 
Jens Axboe
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help