Re: [PATCH 4/4] RDMA/umem: batch page unpin in __ib_mem_release()
From: John Hubbard <jhubbard@nvidia.com>
Date: 2021-02-04 00:16:54
Also in: linux-mm, lkml
On 2/3/21 2:00 PM, Joao Martins wrote:
Use the newly added unpin_user_page_range_dirty_lock() for more quickly unpinning a consecutive range of pages represented as compound pages. This will also calculate number of pages to unpin (for the tail pages which matching head page) and thus batch the refcount update. Running a test program which calls mr reg/unreg on a 1G in size and measures cost of both operations together (in a guest using rxe) with THP and hugetlbfs:
In the patch subject line:
s/__ib_mem_release/__ib_umem_release/
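As a rough illustration of the batching described in the quoted commit message, here is a toy userspace model, not kernel code. The names `toy_seg`, `toy_seg_pages()`, `toy_updates_per_page()` and `toy_updates_batched()` are hypothetical, a 4 KiB page size is assumed, and the batched case assumes each segment is backed by exactly one compound page, which the real unpin_user_page_range_dirty_lock() does not require.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for this sketch */
#define TOY_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-in for a scatterlist entry: only the byte length matters. */
struct toy_seg {
	unsigned int length;
};

/* Pages covered by one segment, rounding a partial trailing page up. */
static unsigned long toy_seg_pages(const struct toy_seg *seg)
{
	return TOY_DIV_ROUND_UP((unsigned long)seg->length, TOY_PAGE_SIZE);
}

/* Models the old loop: one refcount update for every single page. */
static unsigned long toy_updates_per_page(const struct toy_seg *segs, size_t n)
{
	unsigned long updates = 0;
	size_t i;

	for (i = 0; i < n; i++)
		updates += toy_seg_pages(&segs[i]);
	return updates;
}

/*
 * Models the batched loop: one update per non-empty segment.
 * Assumption: each segment sits inside a single compound page.
 */
static unsigned long toy_updates_batched(const struct toy_seg *segs, size_t n)
{
	unsigned long updates = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (segs[i].length)
			updates++;
	return updates;
}
```

In this model a 1G region split into 512 segments of 2M each needs 262144 per-page updates but only 512 batched ones. The model counts updates only and says nothing about the measured timings quoted in this thread.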
Before:
590 rounds in 5.003 sec: 8480.335 usec / round
6898 rounds in 60.001 sec: 8698.367 usec / round

After:
2631 rounds in 5.001 sec: 1900.618 usec / round
31625 rounds in 60.001 sec: 1897.267 usec / round

Signed-off-by: Joao Martins <redacted>
---
drivers/infiniband/core/umem.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 2dde99a9ba07..ea4ebb3261d9 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -47,17 +47,17 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
{
- struct sg_page_iter sg_iter;
- struct page *page;
+ bool make_dirty = umem->writable && dirty;
+ struct scatterlist *sg;
+ int i;
Maybe unsigned int is better, so as to perfectly match the scatterlist.length.
if (umem->nmap > 0)
ib_dma_unmap_sg(dev, umem->sg_head.sgl, umem->sg_nents,
DMA_BIDIRECTIONAL);
- for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
- page = sg_page_iter_page(&sg_iter);
- unpin_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
- }
+ for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i)
The change from umem->sg_nents to umem->nmap looks OK, although we should get IB people to verify that there is not some odd bug or reason to leave it as is.
+ unpin_user_page_range_dirty_lock(sg_page(sg),
+ DIV_ROUND_UP(sg->length, PAGE_SIZE), make_dirty);
Is it really OK to refer directly to sg->length? The scatterlist library goes to some effort to avoid having callers directly access the struct member variables.

Actually, the for_each_sg() code and its behavior with sg->length and sg_page(sg) confuse me because I'm new to it, and I don't quite understand how this works, especially with SG_CHAIN.

I'm assuming that you've monitored /proc/vmstat for nr_foll_pin*?
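One way to do the /proc/vmstat check is to compare the nr_foll_pin_acquired and nr_foll_pin_released counters before and after the test run. The sketch below parses vmstat-style "name value" text in userspace. The helper names `vmstat_lookup()` and `foll_pin_outstanding()` are made up for illustration, and the caller is assumed to have already read /proc/vmstat into a NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find one "name value" line in /proc/vmstat-style text.
 * Returns 0 and stores the value on success, -1 if the counter is absent
 * or its value cannot be parsed.
 */
static int vmstat_lookup(const char *text, const char *name,
			 unsigned long long *out)
{
	size_t name_len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require a space after the name so prefixes do not match. */
		if (strncmp(line, name, name_len) == 0 &&
		    line[name_len] == ' ')
			return sscanf(line + name_len, "%llu", out) == 1 ? 0 : -1;

		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}

/* Pins still outstanding: acquired minus released. */
static int foll_pin_outstanding(const char *text, long long *outstanding)
{
	unsigned long long acquired, released;

	if (vmstat_lookup(text, "nr_foll_pin_acquired", &acquired) ||
	    vmstat_lookup(text, "nr_foll_pin_released", &released))
		return -1;

	*outstanding = (long long)(acquired - released);
	return 0;
}
```

If the difference is the same before and after a reg/unreg run on an otherwise idle system, the unpin path released every page it was expected to. The counters are system-wide, so other pinning activity during the run will skew the comparison.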
sg_free_table(&umem->sg_head);
}
thanks,
-- 
John Hubbard
NVIDIA