Re: [RFC PATCH 1/8] mm: Provide pagesize to pmd_populate()
From: Christophe Leroy <hidden>
Date: 2024-03-27 09:58:38
Also in:
linux-mm, lkml
On 26/03/2024 at 16:01, Jason Gunthorpe wrote:
On Mon, Mar 25, 2024 at 07:05:01PM +0000, Christophe Leroy wrote:
Not looked into details yet, but I guess so. By the way, there is a wiki dedicated to huge pages on powerpc, you can have a look at it here: https://github.com/linuxppc/wiki/wiki/Huge-pages , maybe you'll find good ideas there to help me.
There sure are a lot of page table types here. I'm a bit wondering about terminology, eg on the first diagram, does "huge pte entry" mean a PUD entry that is a leaf? Which ones are contiguous replications?
Yes, on the first diagram, a huge pte entry covering the same size as a PUD entry means a leaf PUD entry. Contiguous replications are only on 8xx for the time being and are displayed as "consecutive entries".
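As a rough illustration of what "consecutive entries" amounts to (a sketch, not kernel code): a huge page larger than what one table slot maps is described by repeating the same entry in N adjacent slots, N being the ratio of the two sizes. The helper name consecutive_entries and its parameters are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: number of adjacent table
 * slots that must carry the same entry to map one huge page, when each
 * slot natively covers (1 << slot_shift) bytes.
 * Assumes huge_shift - slot_shift is smaller than 32.
 */
static inline uint32_t consecutive_entries(unsigned int huge_shift,
                                           unsigned int slot_shift)
{
    if (huge_shift <= slot_shift)
        return 1; /* the page fits in a single slot */
    return (uint32_t)1 << (huge_shift - slot_shift);
}
```

With 4k base pages this gives 4 slots for a 16k page and 128 slots for a 512k page, the two 8xx sizes mentioned later in this message. Applied one level up, and assuming a PGD entry covers 4M, an 8M page spans 2 PGD entries.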
Just general remarks on the ones with huge pages:
hash 64k and hugepage 16M/16G
radix 64k/radix hugepage 2M/1G
radix 4k/radix hugepage 2M/1G
nohash 32
- I think this is just a normal x86 like scheme? PMD/PUD can be a
leaf with the same size as a next level table.
Do any of these cases need to know the higher level to parse the
lower? eg is there a 2M bit in the PUD indicating that the PMD
is a table of 2M leafs or does each PMD entry have a bit
indicating it is a leaf?
For hash and radix there is a bit that tells it is a leaf (_PAGE_PTE).
For nohash32/e500 I think the drawing is not fully right: there is a huge page directory (hugepd) with a single entry. I think it should be possible to change it to a leaf entry; it seems we have bit _PAGE_SW1 available in the PTE.
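To make the "each entry carries its own leaf bit" case concrete, here is a minimal sketch. The names example_entry_t, EXAMPLE_LEAF_BIT and example_entry_is_leaf are made up for illustration, and the bit position is an arbitrary placeholder standing in for _PAGE_PTE; the real value comes from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry type: one 64-bit word per table slot. */
typedef uint64_t example_entry_t;

/* Placeholder bit position chosen for this sketch only. */
#define EXAMPLE_LEAF_BIT (UINT64_C(1) << 62)

/*
 * The entry alone says whether it is a leaf or a pointer to the next
 * level, so a walker needs no information from the upper level.
 */
static inline bool example_entry_is_leaf(example_entry_t entry)
{
    return (entry & EXAMPLE_LEAF_BIT) != 0;
}
```

A walker built this way can stop at whichever level first returns true, without knowing anything about the table above it.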
hash 4k and hugepage 16M/16G
nohash 64
- How does this work? I guess since 8xx explicitly calls out
consecutive this is actually the pgd can point to 512 256M
entries or 8 16G entries? Ie the table size at each level is
variable? Or is it the same and the table size is still 512 and
each 16G entry is replicated 64 times?
For those it is using the huge page directory (hugepd) which can be
hooked at any level and is a directory of huge pages on its own. There
are no consecutive entries involved here I think, although I'm not
completely sure.
For hash4k I'm not sure how it works, this was changed by commit
e2b3d202d1db ("powerpc: Switch 16GB and 16MB explicit hugepages to a
different page table format")
For the nohash/64, a PGD entry points either to a regular PUD directory
or to a HUGEPD directory. The size of the HUGEPD directory is encoded in
the 6 lower bits of the PGD entry.
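A sketch of what pulling the size out of the 6 lower bits could look like. Everything named example_* or EXAMPLE_* is hypothetical; how the 6-bit value maps to an actual directory size, and how a HUGEPD entry is told apart from a regular PUD pointer, is not covered here.

```c
#include <stdint.h>

/* Mask for the 6 lower bits of the entry (0x3f). */
#define EXAMPLE_HUGEPD_SIZE_MASK UINT64_C(0x3f)

/* Hypothetical decoded form of a PGD entry pointing to a hugepd. */
struct example_hugepd {
    uint64_t table;          /* entry with the 6 lower bits cleared */
    unsigned int size_field; /* raw 6-bit size encoding, uninterpreted */
};

static inline struct example_hugepd example_decode_hugepd(uint64_t pgd_entry)
{
    struct example_hugepd hpd;

    hpd.table = pgd_entry & ~EXAMPLE_HUGEPD_SIZE_MASK;
    hpd.size_field = (unsigned int)(pgd_entry & EXAMPLE_HUGEPD_SIZE_MASK);
    return hpd;
}
```

This only works because the directory address is aligned so that its 6 lower bits are zero, which leaves them free to carry the size.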
Do the offset accessors already abstract this enough?
8xx 4K
8xx 16K
- As this series does?
This is how it is prior to the series, ie 16k and 512k pages are implemented as contiguous PTEs in a standard page table, while 8M pages are implemented with hugepd and a single entry in it (with two PGD entries pointing to the same huge page directory).
Christophe