Thread: 54 messages, 10 authors, 2024-09-06

Re: [PATCH 00/13] fs/dax: Fix FS DAX page reference counts

From: Alistair Popple <apopple@nvidia.com>
Date: 2024-06-28 00:21:01
Also in: linux-arm-kernel, linux-cxl, linux-doc, linux-ext4, linux-fsdevel, linux-mm, linux-xfs, lkml, nvdimm

Dan Williams [off-list ref] writes:
Alistair Popple wrote:
Dan Williams [off-list ref] writes:
Alistair Popple wrote:
FS DAX pages have always maintained their own page reference counts
without following the normal rules for page reference counting. In
particular pages are considered free when the refcount hits one rather
than zero and refcounts are not added when mapping the page.
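
As an illustration only (this is not code from the series), the two conventions described above can be modelled with a toy counter. Every name below (`toy_page`, `toy_get`, `toy_put_legacy_dax`, `toy_put_normal`) is hypothetical and does not exist in the kernel; the sketch only shows where the "page is free" threshold sits in each scheme.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_page {
    int refcount;
};

/* Take one reference on the page. */
static void toy_get(struct toy_page *p)
{
    p->refcount++;
}

/*
 * Legacy FS DAX convention as described in the cover letter: an idle
 * page rests at refcount 1, so a drop *to one* means the page is free.
 * Returns true when the page became free.
 */
static bool toy_put_legacy_dax(struct toy_page *p)
{
    assert(p->refcount > 1);
    p->refcount--;
    return p->refcount == 1;
}

/*
 * Normal convention: the page is free when the count reaches zero.
 * Returns true when the page became free.
 */
static bool toy_put_normal(struct toy_page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
    return p->refcount == 0;
}
```

In this model a single user of a legacy DAX page moves the count from 1 to 2 and back to 1, while a single user of a normal page moves it from 0 to 1 and back to 0. Code that wants to ask "is this page free?" therefore needs a DAX-specific check in the first scheme and none in the second.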

Tracking this requires special PTE bits (PTE_DEVMAP) and a secondary
mechanism for allowing GUP to hold references on the page (see
get_dev_pagemap). However there doesn't seem to be any reason why FS
DAX pages need their own reference counting scheme.

By treating the refcounts on these pages the same way as normal pages
we can remove a lot of special checks. In particular pXd_trans_huge()
becomes the same as pXd_leaf(), although I haven't made that change
here. It also frees up a valuable SW-defined PTE bit on architectures
that have devmap PTE bits defined.

It also almost certainly allows further clean-up of the devmap managed
functions, but I have left that as a future improvement.

This is an update to the original RFC rebased onto v6.10-rc5. Unlike
the original RFC it passes the same number of ndctl test suite
(https://github.com/pmem/ndctl) tests as my current development
environment does without these patches.
Are you seeing the 'mmap.sh' test fail even without these patches?
No. But I also don't see it failing with these patches :)

For reference this is what I see on my test machine with or without:

[1/70] Generating version.h with a custom command
 1/13 ndctl:dax / daxdev-errors.sh          SKIP             0.06s   exit status 77
 2/13 ndctl:dax / multi-dax.sh              SKIP             0.05s   exit status 77
 3/13 ndctl:dax / sub-section.sh            SKIP             0.14s   exit status 77
I really need to get this test built as a service as this shows a
pre-req is missing, and it's not quite fair to expect submitters to put
it all together.
Ok. I didn't dig into why this was being skipped but I might if I find
some time. The rest of the tests seemed more relevant anyway and turned
up enough bugs with my initial implementation to keep me busy which gave
me some confidence.

If I'm being honest though I found the whole test setup a bit of a
pain. In particular remembering you have to manually (re)build the
special test versions of the modules tripped me up a few times until I
updated my build scripts. But I got there in the end.
 4/13 ndctl:dax / dax-dev                   OK               0.02s
 5/13 ndctl:dax / dax-ext4.sh               OK              12.97s
 6/13 ndctl:dax / dax-xfs.sh                OK              12.44s
 7/13 ndctl:dax / device-dax                OK              13.40s
 8/13 ndctl:dax / revoke-devmem             FAIL             0.31s   (exit status 250 or signal 122 SIGinvalid)
TEST_PATH=/home/apopple/ndctl/build/test LD_LIBRARY_PATH=/home/apopple/ndctl/build/cxl/lib:/home/apopple/ndctl/build/daxctl/lib:/home/apopple/ndctl/build/ndctl/lib NDCTL=/home/apopple/ndctl/build/ndctl/ndctl MALLOC_PERTURB_=227 DATA_PATH=/home/apopple/ndctl/test DAXCTL=/home/apopple/ndctl/build/daxctl/daxctl /home/apopple/ndctl/build/test/revoke_devmem
 9/13 ndctl:dax / device-dax-fio.sh         OK              32.43s
10/13 ndctl:dax / daxctl-devices.sh         SKIP             0.07s   exit status 77
11/13 ndctl:dax / daxctl-create.sh          SKIP             0.04s   exit status 77
12/13 ndctl:dax / dm.sh                     FAIL             0.08s   exit status 1
MALLOC_PERTURB_=209 TEST_PATH=/home/apopple/ndctl/build/test LD_LIBRARY_PATH=/home/apopple/ndctl/build/cxl/lib:/home/apopple/ndctl/build/daxctl/lib:/home/apopple/ndctl/build/ndctl/lib NDCTL=/home/apopple/ndctl/build/ndctl/ndctl DATA_PATH=/home/apopple/ndctl/test DAXCTL=/home/apopple/ndctl/build/daxctl/daxctl /home/apopple/ndctl/test/dm.sh
13/13 ndctl:dax / mmap.sh                   OK             107.57s
I need to think through why this one might falsely succeed, but that can
wait until we get this series reviewed. For now my failure is stable
which allows it to be bisected.
Ok:                 6   
Expected Fail:      0   
Fail:               2   
Unexpected Pass:    0   
Skipped:            5   
Timeout:            0   

I have been using QEMU for my testing. Maybe I missed some condition in
the unmap path though so will take another look.
I was able to bisect to:
I could have guessed that one, as it's pretty much the crux of this
series given it's the one that switches everything away from
pXX_devmap. That means pXX_leaf/_trans_huge will start returning true
for DAX pages.

Based on your dump I'm guessing I missed some case in the
zap_pXX_range() path. It could be helpful to narrow down which of the
pXX paths is crashing but I will take another look there.
[PATCH 10/13] fs/dax: Properly refcount fs dax pages

...I will prioritize that one in my review queue.
Thanks!