Thread: 67 messages, 6 authors, 2020-01-07

Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN

From: John Hubbard <jhubbard@nvidia.com>
Date: 2019-12-29 04:33:42
Also in: bpf, dri-devel, kvm, linux-block, linux-doc, linux-fsdevel, linux-kselftest, linux-media, linux-mm, linux-rdma, lkml, netdev

On 12/27/19 1:56 PM, John Hubbard wrote:
...
> It is ancient verification test (~10y) which is not an easy task to
> make it understandable and standalone :).
Is this the only test that fails, btw? No other test failures or hints of
problems?

(Also, maybe hopeless, but can *anyone* on the RDMA list provide some
characterization of the test, such as how many pins per page, what page
sizes are used? I'm still hoping to write a test to trigger something
close to this...)

I do have a couple more ideas for test runs:

1. Reduce GUP_PIN_COUNTING_BIAS to 1. That would turn the whole override of
page->_refcount into a no-op, and so if all is well (it may not be!) with the
rest of the patch, then we'd expect this problem to not reappear.

2. Activate /proc/vmstat *foll_pin* statistics unconditionally (just for these
tests, of course), so we can see if there is a get/put mismatch. However, that
will change the timing, and so it must be attempted independently of (1), in
order to see if it ends up hiding the repro.
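
To make idea (1) concrete, here is a small user-space sketch of biased pin
counting. It is a toy model, not kernel code: struct toy_page and the toy_*
helpers are hypothetical names, and the bias of 1024 used in the note below is
an assumption about the series rather than something stated in this message.
It shows why a bias of 1 makes a pin arithmetically identical to an ordinary
reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; only the reference count is modeled. */
struct toy_page {
    int refcount;   /* plays the role of page->_refcount */
};

/* Ordinary reference: adds 1. */
static void toy_get(struct toy_page *p)
{
    p->refcount += 1;
}

/* Drop an ordinary reference; a get/put mismatch would trip the assert. */
static void toy_put(struct toy_page *p)
{
    assert(p->refcount >= 1);
    p->refcount -= 1;
}

/* Pin: adds the bias instead of 1. */
static void toy_pin(struct toy_page *p, int bias)
{
    p->refcount += bias;
}

/* Unpin: removes exactly what toy_pin() added. */
static void toy_unpin(struct toy_page *p, int bias)
{
    assert(p->refcount >= bias);
    p->refcount -= bias;
}

/* Heuristic "maybe pinned" check: refcount has reached the bias. */
static bool toy_maybe_pinned(const struct toy_page *p, int bias)
{
    return p->refcount >= bias;
}
```

With a bias of 1024, a page that starts at refcount 1 reads 1025 after one
toy_pin() and is reported as maybe pinned; after toy_unpin() it is back at 1
and no longer reported. With a bias of 1, toy_pin() and toy_get() do the same
arithmetic, and every referenced page passes toy_maybe_pinned(), so the
override of the count no longer changes anything.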

I've updated this branch to implement (1), but not (2), hoping you can give
this one a spin?

    git@github.com:johnhubbard/linux.git  pin_user_pages_tracking_v11_with_diags

Also, looking ahead:

a) if the problem disappears with the above test branch, then we likely have
   a huge page refcount overflow, and there are a couple of different ways to
   fix it.

b) if it still reproduces with the above, then it's some other random mistake,
   and in that case I'd be inclined to do a sort of guided (or classic, unguided)
   git bisect of the series. Because it could be any of several patches.

   If that's too much trouble, then I'd have to fall back to submitting a few
   patches at a time and working my way up to the tracking patch...
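
For a rough sense of scale on case (a), the arithmetic can be sketched as
below. The inputs are assumptions, not facts from this message: a 32-bit
signed reference count, a bias of 1024, 512 base pages per huge page (2 MB
pages over 4 KB base pages), and every subpage pin being charged to a single
shared count. The function names are hypothetical.

```c
#include <limits.h>

/*
 * Largest number of pins whose combined bias still fits in a
 * 32-bit signed count (other references are ignored here).
 */
static long long pins_that_fit(long long bias)
{
    return (long long)INT_MAX / bias;
}

/*
 * Same limit, expressed as the number of times an entire huge page
 * can be pinned when each of its subpages contributes one pin to
 * the same shared count.
 */
static long long full_hugepage_pins_that_fit(long long bias,
                                             long long subpages)
{
    return pins_that_fit(bias) / subpages;
}
```

Under those assumptions, pins_that_fit(1024) is 2097151, and
full_hugepage_pins_that_fit(1024, 512) is 4095, so a few thousand overlapping
pins of one huge page would be enough to overflow. With a bias of 1 the second
figure rises to 4194303, which is consistent with the expectation that an
overflow would stop reproducing on the test branch.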


thanks,
-- 
John Hubbard
NVIDIA