Thread: 25 messages, 6 authors, 2019-03-21

Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

From: Dan Williams <hidden>
Date: 2019-03-14 04:02:32
Also in: linux-mm, lkml, nvdimm

On Wed, Mar 13, 2019 at 8:45 PM Aneesh Kumar K.V
[off-list ref] wrote:
[..]
quoted
quoted
Now w.r.t. failures, can device-dax do opportunistic huge page
usage?
device-dax explicitly disclaims the ability to do opportunistic mappings.
quoted
I haven't looked at the device-dax details fully yet. Do we treat the
mapping page size as part of the on-device format for device-dax? Is that
derived from the nd_pfn->align value?
Correct.
quoted
Here is what I am working on:
1) If the platform doesn't support huge pages and the device superblock
indicates that it was created with huge page support, we fail the device
init.
Ok.
quoted
2) Now if we are creating a new namespace without huge page support in
the platform, then we force the align details to PAGE_SIZE. In such a
configuration, when handling a dax fault, even with THP enabled during
the build, we should not try to use a hugepage. This I think we can
achieve by using TRANSPARENT_HUGEPAGE_DAX_FLAG.
How is this dynamic property communicated to the guest?
Via the device tree on powerpc. We have a device tree node indicating
the supported page sizes.
Ah, ok, yeah let's plumb that straight to the device-dax driver and
leave out the interaction / interpretation of the thp-enabled flags.
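
For concreteness, the two cases above could be sketched roughly as below. This is only an illustration of the policy being discussed, not driver code: the function names, the `platform_has_huge_pages` parameter (standing in for whatever the device tree reports), and the 4K/2M sizes are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative sizes only; real values depend on the architecture. */
#define SKETCH_PAGE_SIZE (4096UL)
#define SKETCH_PMD_SIZE  (2UL << 20)

/*
 * Case 1 (hypothetical helper): an existing namespace whose stored
 * alignment needs huge pages cannot be initialized on a platform
 * that does not support them.
 */
static int sketch_check_stored_align(bool platform_has_huge_pages,
                                     unsigned long stored_align)
{
    assert(stored_align >= SKETCH_PAGE_SIZE);
    if (stored_align > SKETCH_PAGE_SIZE && !platform_has_huge_pages)
        return -EOPNOTSUPP;   /* fail the device init */
    return 0;
}

/*
 * Case 2 (hypothetical helper): a new namespace created on a platform
 * without huge page support is forced to PAGE_SIZE alignment.
 */
static unsigned long sketch_pick_new_align(bool platform_has_huge_pages,
                                           unsigned long requested_align)
{
    if (!platform_has_huge_pages)
        return SKETCH_PAGE_SIZE;
    return requested_align;
}
```

Note that neither helper looks at the THP setting; the only input is what the platform reports.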
quoted
quoted
Also, even if the user decided to not use THP, by
echo "never" > transparent_hugepage/enabled, we should continue to map
dax faults using huge pages on platforms that can support huge pages.

This still doesn't cover the details of a device-dax created with
PAGE_SIZE align and later booted with a kernel that can do hugepage dax. How
should we handle that? That makes me think this should be a VMA flag
derived from the device config? Maybe use VM_HUGEPAGE to indicate
whether the device should use a hugepage mapping or not?
device-dax configured with PAGE_SIZE always gets PAGE_SIZE mappings.
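
As a rough sketch of what "no opportunistic mappings" means for fault handling: the outcome depends only on the configured alignment, and the THP setting is ignored. The names, the three-way result, and the sizes below are assumptions for illustration and are not copied from the driver.

```c
#include <stdbool.h>

/* Illustrative sizes only; real values depend on the architecture. */
#define SKETCH_PAGE_SIZE (4096UL)
#define SKETCH_PMD_SIZE  (2UL << 20)

enum sketch_fault_result {
    SKETCH_FAULT_OK,        /* fault size matches the device alignment */
    SKETCH_FAULT_FALLBACK,  /* fault is larger; retry at a smaller size */
    SKETCH_FAULT_SIGBUS     /* fault is smaller than the device allows */
};

/*
 * Hypothetical strict policy: the device alignment alone decides the
 * mapping size. thp_enabled is accepted only to show that it plays no
 * part in the decision.
 */
static enum sketch_fault_result sketch_dax_fault(unsigned long device_align,
                                                 unsigned long fault_size,
                                                 bool thp_enabled)
{
    (void)thp_enabled;  /* deliberately unused */

    if (fault_size < device_align)
        return SKETCH_FAULT_SIGBUS;
    if (fault_size > device_align)
        return SKETCH_FAULT_FALLBACK;
    return SKETCH_FAULT_OK;
}
```

Under this sketch a device configured with PAGE_SIZE never ends up with a PMD mapping, whatever the kernel it is later booted with could do.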
Now what will be the page size used for mapping vmemmap?
That's up to the architecture's vmemmap_populate() implementation.
Architectures will possibly use a PMD_SIZE mapping for vmemmap if
supported. Now, in the above example, a device-dax with struct page in
the device will have its pfn reserve area aligned to PAGE_SIZE? We can't
map that using the PMD_SIZE page size?
IIUC, that's a different alignment. Currently that's handled by
padding the reservation area up to a section (128MB on x86) boundary,
but I'm working on patches to allow sub-section sized ranges to be
mapped.
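
The padding arithmetic amounts to rounding up to a power-of-two boundary. A minimal sketch, assuming the 128MB x86 section size mentioned above and hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* 128MB, the x86 section size quoted above; other architectures differ. */
#define SKETCH_SECTION_SIZE (128ULL << 20)

/* Round value up to the next multiple of align (align must be a power of two). */
static uint64_t sketch_align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Bytes of padding needed to extend a range ending at `end` to a section boundary. */
static uint64_t sketch_section_padding(uint64_t end)
{
    return sketch_align_up(end, SKETCH_SECTION_SIZE) - end;
}
```

For a range ending at 130MB this yields 126MB of padding, which is the overhead that sub-section mapping support would avoid.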

Now, that said, I expect there may be bugs lurking in the
implementation if PAGE_SIZE changes from one boot to the next simply
because I've never tested that.

I think this also indicates that the section padding logic can't be
removed until all arch vmemmap_populate() implementations understand
the sub-section case.