Re: [PATCH 9/9] nvme: wire up completion batching for the IRQ path

From: Jens Axboe <axboe@kernel.dk>
Date: 2021-10-13 15:04:44

On 10/13/21 1:12 AM, Christoph Hellwig wrote:

On Tue, Oct 12, 2021 at 12:17:42PM -0600, Jens Axboe wrote:

Trivial to do now, just need our own io_batch on the stack and pass that
in to the usual command completion handling.

I pondered making this dependent on how many entries we had to process,
but even for a single entry there's no discernible difference in
performance or latency. Running a sync workload over io_uring:

t/io_uring -b512 -d1 -s1 -c1 -p0 -F1 -B1 -n2 /dev/nvme1n1 /dev/nvme2n1

yields the below performance before the patch:

IOPS=254820, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251174, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=250806, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)

and the following after:

IOPS=255972, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251920, BW=123MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251794, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)

which definitely isn't slower, about the same if you factor in a bit of
variance. For peak performance workloads, benchmarking shows a 2%
improvement.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4713da708cd4..fb3de6f68eb1 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c

@@ -1076,8 +1076,10 @@ static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
 
 static inline int nvme_process_cq(struct nvme_queue *nvmeq)
 {
+	struct io_batch ib;
 	int found = 0;
 
+	ib.req_list = NULL;

Is this really more efficient than

	struct io_batch ib = { };

Probably not. I could add a DEFINE_IO_BATCH() helper, which would make it
easier if other kinds of init are ever needed.
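
For illustration, here is a minimal userspace sketch of what such a helper might look like. Everything in it is an assumption rather than code from the series: struct request and struct io_batch are stand-ins that carry only the req_list member visible in the diff, the rq_next link and the io_batch_add()/io_batch_count() functions are hypothetical, and the macro body is just one plausible way to write DEFINE_IO_BATCH(). It uses { 0 } instead of the empty-brace form so it also builds as plain ISO C.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct request; hypothetical, illustration only. */
struct request {
	struct request *rq_next;
};

/* Stand-in for struct io_batch; only req_list appears in the quoted diff. */
struct io_batch {
	struct request *req_list;
};

/*
 * Hypothetical helper: declare an io_batch on the stack with every
 * member zeroed, so callers do not open-code the initialization.
 */
#define DEFINE_IO_BATCH(name)	struct io_batch name = { 0 }

/* Push a request onto the head of the batch list (illustrative only). */
static void io_batch_add(struct io_batch *ib, struct request *rq)
{
	rq->rq_next = ib->req_list;
	ib->req_list = rq;
}

/* Count the requests currently queued on the batch. */
static int io_batch_count(const struct io_batch *ib)
{
	const struct request *rq;
	int n = 0;

	for (rq = ib->req_list; rq; rq = rq->rq_next)
		n++;
	return n;
}

/* A freshly defined batch starts empty, then holds what was added. */
static int demo(void)
{
	DEFINE_IO_BATCH(ib);
	struct request a, b;

	assert(ib.req_list == NULL);
	io_batch_add(&ib, &a);
	io_batch_add(&ib, &b);
	return io_batch_count(&ib);
}
```

With a helper along these lines, the separate declaration and the ib.req_list = NULL assignment in nvme_process_cq() would collapse into a single DEFINE_IO_BATCH(ib); line, and any member added to the struct later would be zeroed without touching the callers.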

-- 
Jens Axboe
