Thread: 43 messages, 7 authors, 2021-08-20

Re: [PATCH v6 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

From: Jerome Glisse <hidden>
Date: 2021-08-20 07:24:37
Also in: amd-gfx, dri-devel, linux-ext4, linux-xfs

On Thu, Aug 19, 2021 at 10:05 PM Christoph Hellwig [off-list ref] wrote:
On Tue, Aug 17, 2021 at 11:44:54AM -0400, Felix Kuehling wrote:
That's a good catch. Existing drivers shouldn't need a page_free
callback if they didn't have one before. That means we need to add a
NULL-pointer check in free_device_page.
Also the other state clearing (__ClearPageWaiters/mem_cgroup_uncharge/
->mapping = NULL).
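
For illustration, the NULL-pointer check and state clearing discussed above could look roughly like the following user-space sketch. Every name carrying a _model suffix is a hypothetical stand-in invented for this example, not the kernel's definition; the real struct dev_pagemap and the real free path differ, and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct page and struct dev_pagemap. */
struct page_model {
	void *mapping;        /* models page->mapping */
	int freed_by_driver;  /* set by the driver callback in this sketch */
};

struct pagemap_ops_model {
	/* Optional: existing drivers may not provide this callback. */
	void (*page_free)(struct page_model *page);
};

struct pgmap_model {
	const struct pagemap_ops_model *ops;  /* may be NULL */
};

/* Example driver callback, used only to show the non-NULL path. */
static void demo_page_free(struct page_model *page)
{
	page->freed_by_driver = 1;
}

/*
 * Sketch of a free path that tolerates a missing page_free callback:
 * clear the per-page state first, then call the driver only if it
 * registered a callback.
 */
static void free_device_page_model(struct pgmap_model *pgmap,
				   struct page_model *page)
{
	assert(pgmap != NULL && page != NULL);

	page->mapping = NULL;  /* models the "->mapping = NULL" clearing */

	if (pgmap->ops && pgmap->ops->page_free)
		pgmap->ops->page_free(page);
}
```

With this shape, a pgmap without ops (or without page_free) still gets its page state cleared, and a driver that registers demo_page_free is called as before.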

In many ways this seems like you want to bring back the DEVICE_PUBLIC
pgmap type that was removed a while ago due to the lack of users
instead of overloading the generic type.
I think so. I'm not clear about how DEVICE_PUBLIC differed from what
DEVICE_GENERIC is today. As I understand it, DEVICE_PUBLIC was removed
because it was unused and also known to be broken in some ways.
DEVICE_GENERIC seemed close enough to what we need, other than not being
supported in the migration helpers.

Would you see benefit in re-introducing DEVICE_PUBLIC as a distinct
memory type from DEVICE_GENERIC? What would be the benefits of making
that distinction?
The old DEVICE_PUBLIC mostly differed in that it allowed the page
to be returned from vm_normal_page, which I think was horribly buggy.
Why was that buggy? If I were to do it now, I would return
DEVICE_PUBLIC pages from vm_normal_page but I would ban pinning, as
pinning is exceptionally wrong for GPUs. If you migrate some random
anonymous/file page back to your GPU memory and it gets pinned there,
then there is no way for the GPU to migrate the page out. Quickly you
will run out of physically contiguous memory, and things like big
graphics buffer allocations (anything that needs physically contiguous
memory) will fail. It is less of an issue on some hardware that relies
less and less on physically contiguous memory, but I do not think it is
completely gone from all hw.
But the point is not to bring back these old semantics.  The idea
is to be able to differentiate between your new coherent on-device
memory and the existing DEVICE_GENERIC.  That is, call the
code in free_devmap_managed_page that is currently only used
for device private pages also for your new public device pages, without
affecting the devdax and xen use cases.
Yes, I would rather bring back DEVICE_PUBLIC than try to use
DEVICE_GENERIC. The GENERIC change was done for users that closely
matched DAX semantics, and that is not the case here, at least not from
my point of view.

Jerome