Thread: 28 messages, 4 authors, 2024-05-26

Re: [RFC PATCH 1/8] mm: Provide pagesize to pmd_populate()

From: Christophe Leroy <hidden>
Date: 2024-03-27 09:58:38
Also in: linux-mm, lkml


On 26/03/2024 at 16:01, Jason Gunthorpe wrote:
On Mon, Mar 25, 2024 at 07:05:01PM +0000, Christophe Leroy wrote:
quoted
Not looked into details yet, but I guess so.

By the way there is a wiki dedicated to huge pages on powerpc, you can
have a look at it here :
https://github.com/linuxppc/wiki/wiki/Huge-pages , maybe you'll find
good ideas there to help me.
There sure are a lot of page table types here

I'm a bit wondering about terminology, eg on the first diagram "huge
pte entry" means a PUD entry that is a leaf? Which ones are contiguous
replications?
Yes, on the first diagram, a huge pte entry covering the same size as 
pud entry means a leaf PUD entry.

Contiguous replications are only on 8xx for the time being and are 
displayed as "consecutive entries".
Just general remarks on the ones with huge pages:

  hash 64k and hugepage 16M/16G
  radix 64k/radix hugepage 2M/1G
  radix 4k/radix hugepage 2M/1G
  nohash 32
   - I think this is just a normal x86 like scheme? PMD/PUD can be a
     leaf with the same size as a next level table.

     Do any of these cases need to know the higher level to parse the
     lower? eg is there a 2M bit in the PUD indicating that the PMD
     is a table of 2M leafs or does each PMD entry have a bit
     indicating it is a leaf?
For hash and radix there is a bit that tells it is leaf (_PAGE_PTE)
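
A minimal sketch of that per-entry test, assuming a 64-bit entry. The bit
position used for the flag is illustrative only (the real _PAGE_PTE
definition lives in the kernel headers), in-memory byte order is ignored,
and the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only: stands in for the kernel's _PAGE_PTE bit. */
#define SKETCH_PAGE_PTE (UINT64_C(1) << 62)

/*
 * Hypothetical helper: an entry at any level is a leaf when the flag is
 * set, so nothing from the upper level is needed to parse it.
 */
static bool sketch_entry_is_leaf(uint64_t entry)
{
	return (entry & SKETCH_PAGE_PTE) != 0;
}
```

With such a per-entry flag, a walker can stop at whichever level first
reports a leaf, without a size hint from the level above.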

For nohash32/e500 I think the drawing is not fully right, there is a huge 
page directory (hugepd) with a single entry. I think it should be 
possible to change it to a leaf entry, it seems we have bit _PAGE_SW1 
available in the PTE.
  hash 4k and hugepage 16M/16G
  nohash 64
   - How does this work? I guess since 8xx explicitly calls out
     consecutive this is actually the pgd can point to 512 256M
     entries or 8 16G entries? Ie the table size at each level is
     variable? Or is it the same and the table size is still 512 and
     each 16G entry is replicated 64 times?
For those it is using the huge page directory (hugepd) which can be 
hooked at any level and is a directory of huge pages on its own. There 
are no consecutive entries involved here I think, although I'm not 
completely sure.

For hash4k I'm not sure how it works, this was changed by commit 
e2b3d202d1db ("powerpc: Switch 16GB and 16MB explicit hugepages to a 
different page table format")

For the nohash/64, a PGD entry points either to a regular PUD directory 
or to a HUGEPD directory. The size of the HUGEPD directory is encoded in 
the 6 lower bits of the PGD entry.
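
As a rough sketch of that encoding, assuming only what is stated above
(the size field sits in the 6 lowest bits of the entry); the macro and
helper names are hypothetical, and how the field maps to a directory size
is left uninterpreted:

```c
#include <stdint.h>

/* Assumption: the low 6 bits of the PGD entry hold the HUGEPD size field. */
#define SKETCH_HUGEPD_SIZE_MASK UINT64_C(0x3f)

/* Hypothetical helper: extract the 6-bit size field (0..63). */
static unsigned int sketch_hugepd_size_field(uint64_t pgd_entry)
{
	return (unsigned int)(pgd_entry & SKETCH_HUGEPD_SIZE_MASK);
}

/*
 * Hypothetical helper: the entry with the size field masked off. What
 * the remaining bits mean is not interpreted by this sketch.
 */
static uint64_t sketch_hugepd_without_size(uint64_t pgd_entry)
{
	return pgd_entry & ~SKETCH_HUGEPD_SIZE_MASK;
}
```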
     Do the offset accessors already abstract this enough?

  8xx 4K
  8xx 16K
    - As this series does?
This is how it is prior to the series, ie 16k and 512k pages are 
implemented as contiguous PTEs in a standard page table while 8M pages 
are implemented with hugepd and a single entry in it (with two PGD 
entries pointing to the same huge page directory).
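
To make the arithmetic concrete, a small sketch of how many consecutive
PTEs back one huge page for each base page size, and of the span implied
for one PGD entry by the 8M case. All names are hypothetical and nothing
here is taken from the kernel sources:

```c
#define SKETCH_SZ_4K   (4UL * 1024)
#define SKETCH_SZ_16K  (16UL * 1024)
#define SKETCH_SZ_512K (512UL * 1024)
#define SKETCH_SZ_8M   (8UL * 1024 * 1024)

/* One 8M page needs two PGD entries, so each PGD entry spans half of it. */
#define SKETCH_PGD_SPAN (SKETCH_SZ_8M / 2)

/* Number of consecutive base-page PTEs that back one huge page. */
static unsigned long sketch_contig_ptes(unsigned long huge_size,
					unsigned long base_size)
{
	return huge_size / base_size;
}
```

With 4k base pages this gives 4 PTEs for a 16k page and 128 for a 512k
page; with 16k base pages a 512k page takes 32 PTEs.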

Christophe