Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN

[PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 04/25] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
Re: [PATCH v11 04/25] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · Kirill A. Shutemov <hidden> · 2019-12-18
Re: [PATCH v11 04/25] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-19
[PATCH v12] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-19
Re: [PATCH v11 04/25] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · Dan Williams <hidden> · 2019-12-19
Re: [PATCH v11 04/25] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-19
Re: [PATCH v11 04/25] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · Dan Williams <hidden> · 2019-12-19
Re: [PATCH v11 04/25] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-19
[PATCH v11 12/25] IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages*(), fix up ODP · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 23/25] mm/gup: track FOLL_PIN pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v12 23/25] mm/gup: track FOLL_PIN pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-17
[PATCH v11 22/25] mm, tree-wide: rename put_user_page*() to unpin_user_page*() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 20/25] powerpc: book3s64: convert to pin_user_pages() and put_user_page() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 25/25] selftests/vm: run_vmtests: invoke gup_benchmark with basic FOLL_PIN coverage · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 15/25] fs/io_uring: set FOLL_PIN via pin_user_pages() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 03/25] mm: Cleanup __put_devmap_managed_page() vs ->page_free() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 18/25] media/v4l2-core: pin_user_pages (FOLL_PIN) and put_user_page() conversion · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 19/25] vfio, mm: pin_user_pages (FOLL_PIN) and put_user_page() conversion · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 24/25] mm/gup_benchmark: support pin_user_pages() and related calls · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 11/25] goldish_pipe: convert to pin_user_pages() and put_user_page() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 07/25] vfio: fix FOLL_LONGTERM use, simplify get_user_pages_remote() call · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 08/25] mm/gup: allow FOLL_FORCE for get_user_pages_fast() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 17/25] media/v4l2-core: set pages dirty upon releasing DMA buffers · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 02/25] mm/gup: move try_get_compound_head() to top, fix minor issues · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 21/25] mm/gup_benchmark: use proper FOLL_WRITE flags instead of hard-coding "1" · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 14/25] drm/via: set FOLL_PIN via pin_user_pages_fast() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 09/25] IB/umem: use get_user_pages_fast() to pin DMA pages · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 01/25] mm/gup: factor out duplicate code from four routines · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
Re: [PATCH v11 01/25] mm/gup: factor out duplicate code from four routines · Kirill A. Shutemov <hidden> · 2019-12-18
Re: [PATCH v11 01/25] mm/gup: factor out duplicate code from four routines · John Hubbard <jhubbard@nvidia.com> · 2019-12-18
Re: [PATCH v11 01/25] mm/gup: factor out duplicate code from four routines · Kirill A. Shutemov <hidden> · 2019-12-18
[PATCH v11 10/25] mm/gup: introduce pin_user_pages*() and FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 13/25] mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 05/25] goldish_pipe: rename local pin_user_pages() routine · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 16/25] net/xdp: set FOLL_PIN via pin_user_pages() · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
[PATCH v11 06/25] mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM · John Hubbard <jhubbard@nvidia.com> · 2019-12-16
Re: [PATCH v11 06/25] mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM · Kirill A. Shutemov <hidden> · 2019-12-18
Re: [PATCH v11 06/25] mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM · John Hubbard <jhubbard@nvidia.com> · 2019-12-18
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Jan Kara <jack@suse.cz> · 2019-12-17
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Leon Romanovsky <leon@kernel.org> · 2019-12-19
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-19
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Jason Gunthorpe <jgg@ziepe.ca> · 2019-12-19
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-19
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Jason Gunthorpe <jgg@ziepe.ca> · 2019-12-20
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Dan Williams <hidden> · 2019-12-21
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Jason Gunthorpe <jgg@ziepe.ca> · 2019-12-23
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-19
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Leon Romanovsky <leon@kernel.org> · 2019-12-20
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-20
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Leon Romanovsky <leon@kernel.org> · 2019-12-20
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-20
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Leon Romanovsky <leon@kernel.org> · 2019-12-21
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-22
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Leon Romanovsky <leon@kernel.org> · 2019-12-22
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-25
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Leon Romanovsky <leon@kernel.org> · 2019-12-25
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-27
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-29
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Jan Kara <jack@suse.cz> · 2020-01-06
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2020-01-07
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Jan Kara <jack@suse.cz> · 2019-12-20
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-21
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Dan Williams <hidden> · 2019-12-21
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-21
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · Dan Williams <hidden> · 2019-12-21
Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN · John Hubbard <jhubbard@nvidia.com> · 2019-12-21

From: Dan Williams <hidden>
Date: 2019-12-21 00:32:33
Also in: bpf, dri-devel, kvm, linux-block, linux-doc, linux-fsdevel, linux-kselftest, linux-media, linux-mm, linux-rdma, lkml, netdev

On Fri, Dec 20, 2019 at 5:34 AM Jason Gunthorpe [off-list ref] wrote:

On Thu, Dec 19, 2019 at 01:13:54PM -0800, John Hubbard wrote:

On 12/19/19 1:07 PM, Jason Gunthorpe wrote:

On Thu, Dec 19, 2019 at 12:30:31PM -0800, John Hubbard wrote:

On 12/19/19 5:26 AM, Leon Romanovsky wrote:

On Mon, Dec 16, 2019 at 02:25:12PM -0800, John Hubbard wrote:

Hi,

This implements an API naming change (put_user_page*() -->
unpin_user_page*()), and also implements tracking of FOLL_PIN pages. It
extends that tracking to a few select subsystems. More subsystems will
be added in follow-up work.

Hi John,

The patchset generates kernel panics in our IB testing. In our tests, we
allocated a single memory block and registered multiple MRs on that
block.

The possible bad flow is:
   ib_umem_get() ->
    pin_user_pages_fast(FOLL_WRITE) ->
     internal_get_user_pages_fast(FOLL_WRITE) ->
      gup_pgd_range() ->
       gup_huge_pd() ->
        gup_hugepte() ->
         try_grab_compound_head() ->

Hi Leon,

Thanks very much for the detailed report! So we're overflowing...

At first look, this seems likely to be hitting a weak point in the
GUP_PIN_COUNTING_BIAS-based design, one that I believed could be deferred
(there's a writeup in Documentation/core-api/pin_user_pages.rst, lines
99-121). Basically it's pretty easy to overflow the page->_refcount
with huge pages if the pages have a *lot* of subpages.

We can only do about 7 pins on 1GB huge pages that use 4KB subpages.
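
The figure of about 7 can be reproduced with a little arithmetic. The sketch below assumes GUP_PIN_COUNTING_BIAS is 1024 (1 << 10), that page->_refcount is a 32-bit signed counter, and that one pin of a whole huge page adds (number of subpages) * bias to the head page's refcount. The helper names are hypothetical, and references already held on the page are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of GUP_PIN_COUNTING_BIAS in this series (1 << 10). */
#define PIN_BIAS 1024ULL
/* Assumption: the refcount is a 32-bit signed int, so it tops out at INT32_MAX. */
#define REFCOUNT_MAX ((uint64_t)INT32_MAX)

/* Hypothetical helper: refcount added to the head page when every subpage
 * of one huge page is pinned once. */
uint64_t refs_per_full_pin(uint64_t huge_size, uint64_t subpage_size)
{
    assert(subpage_size > 0 && huge_size % subpage_size == 0);
    return (huge_size / subpage_size) * PIN_BIAS;
}

/* Hypothetical helper: how many such whole-page pins fit before the
 * counter would exceed REFCOUNT_MAX. */
uint64_t max_full_pins(uint64_t huge_size, uint64_t subpage_size)
{
    return REFCOUNT_MAX / refs_per_full_pin(huge_size, subpage_size);
}
```

Under these assumptions, max_full_pins(1ULL << 30, 4096) evaluates to 7 (each pin adds 2^28 to a counter that holds just under 2^31), while max_full_pins(2ULL << 20, 4096) for a 2MB huge page evaluates to 4095.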

Considering that establishing these pins is entirely under user
control, we can't have a limit here.

There's already a limit, it's just a much larger one. :) What does "no limit"
really mean, numerically, to you in this case?

I guess I mean 'hidden limit' - hitting the limit and failing would
be manageable.

I think 7 is probably too low, but we are not using 1GB huge
pages, only 2MB ones.

What about RDMA to 1GB-hugetlbfs and 1GB-device-dax mappings?
