Re: [PATCH v4 19/25] proc/task_mmu: Ignore ZONE_DEVICE pages
From: Alistair Popple <apopple@nvidia.com>
Date: 2025-01-06 06:43:31
Also in:
linux-arm-kernel, linux-cxl, linux-doc, linux-ext4, linux-fsdevel, linux-mm, linux-xfs, lkml, nvdimm
On Fri, Dec 20, 2024 at 07:32:52PM +0100, David Hildenbrand wrote:
> On 19.12.24 00:11, Alistair Popple wrote:
> > On Tue, Dec 17, 2024 at 11:31:25PM +0100, David Hildenbrand wrote:
> > > On 17.12.24 06:13, Alistair Popple wrote:
> > > > The procfs mmu files such as smaps currently ignore device dax and
> > > > fs dax pages because these pages are considered special. To maintain
> > > > existing behaviour once these pages are treated as normal pages and
> > > > returned from vm_normal_page() add tests to explicitly skip them.
> > > >
> > > > Signed-off-by: Alistair Popple <apopple@nvidia.com>
> > > > ---
> > > >  fs/proc/task_mmu.c | 18 ++++++++++++++----
> > > >  1 file changed, 14 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > > > index 38a5a3e..c9b227a 100644
> > > > --- a/fs/proc/task_mmu.c
> > > > +++ b/fs/proc/task_mmu.c
> > > > @@ -801,6 +801,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
> > > >  	if (pte_present(ptent)) {
> > > >  		page = vm_normal_page(vma, addr, ptent);
> > > > +		if (page && (is_device_dax_page(page) || is_fsdax_page(page)))
> > >
> > > This "is_device_dax_page(page) || is_fsdax_page(page)" is a common theme
> > > here, likely we should have a special helper?
> >
> > Sounds good, will add is_dax_page() if there are enough callers left after
> > any review comments.
>
> :)

In the end there was only a single caller so I will leave this open-coded.

> > > But, don't we actually want to include them in the smaps output now? I
> > > think we want.
> >
> > I'm not an expert in what callers of vm_normal_page() think of as a
> > "normal" page.
>
> Yeah, it's tricky. It means "this is abnormal, don't look at the struct
> page". We're moving away from that, such that these folios/pages will be
> ... mostly normal :)
>
> > So my philosophy here was to ensure anything calling vm_normal_page()
> > didn't accidentally start seeing DAX pages, either by checking existing
> > filters (lots of callers already call vma_is_special_huge() or some
> > equivalent) or explicitly filtering them out in the hope someone smarter
> > than me could tell me it was unnecessary.
> >
> > That strategy seems to have worked, and so I agree we likely do want them
> > in smaps. I just didn't want to silently do it without this kind of
> > discussion first.
>
> Yes, absolutely.
>
> > > The rmap code will indicate these pages in /proc/meminfo, per-node info,
> > > in the memcg ... as "Mapped:" etc. So likely we just want to also
> > > indicate them here, or are there any downsides we know of?
> >
> > I don't know of any, and I think it makes sense to also indicate them so
> > will drop this check in the respin.
>
> It will be easy to hide them later, at least we talked about it. Thanks
> for doing all this!

Not a problem. The other main thing in this patch is also hiding them from /proc/<PID>/pagemap. Based on this discussion I can't think of any good reason why we would want to hide them there so will also remove the checks in the pagemap walker.
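
For anyone wanting to check from userspace what the pagemap walker reports for a mapping, a minimal sketch follows. It assumes the entry layout documented in Documentation/admin-guide/mm/pagemap.rst (bit 63 present, bit 62 swapped, bit 61 file-mapped or shared anon, bits 0-54 PFN). The macro and function names are made up for illustration and are not part of this series, and whether a DAX mapping shows up as present depends on how the respin lands.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Entry layout as documented in Documentation/admin-guide/mm/pagemap.rst. */
#define PM_PFN_MASK	((UINT64_C(1) << 55) - 1)	/* bits 0-54: PFN if present */
#define PM_FILE		(UINT64_C(1) << 61)		/* file-mapped or shared anon */
#define PM_SWAP		(UINT64_C(1) << 62)		/* page is swapped out */
#define PM_PRESENT	(UINT64_C(1) << 63)		/* page is present in RAM */

/* Hypothetical helpers: decode a single 64-bit pagemap entry. */
static int pagemap_present(uint64_t entry)
{
	return (entry & PM_PRESENT) != 0;
}

static int pagemap_file(uint64_t entry)
{
	return (entry & PM_FILE) != 0;
}

/* The PFN reads as zero unless the caller has CAP_SYS_ADMIN. */
static uint64_t pagemap_pfn(uint64_t entry)
{
	return entry & PM_PFN_MASK;
}

/*
 * Read the pagemap entry covering vaddr in the calling process.
 * Returns 0 on success and -1 on error.
 */
static int pagemap_read_self(const void *vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t off;
	ssize_t n;
	int fd;

	assert(page_size > 0);

	/* One 8-byte entry per virtual page, indexed by virtual page number. */
	off = (off_t)((uintptr_t)vaddr / (uintptr_t)page_size) *
	      (off_t)sizeof(*entry);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, entry, sizeof(*entry), off);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Calling pagemap_read_self() on an address inside a touched DAX mapping and then pagemap_present() on the result would show whether the walker reports the page.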

> --
> Cheers,
>
> David / dhildenb