Thread: 53 messages, 13 authors, 2024-12-06

Re: [RFC PATCH v1 00/57] Boot-time page size selection for arm64

From: Ryan Roberts <ryan.roberts@arm.com>
Date: 2024-10-16 08:23:56
Also in: linux-mm, lkml

On 15/10/2024 19:38, Michael Kelley wrote:
From: Ryan Roberts <ryan.roberts@arm.com> Sent: Monday, October 14, 2024 3:55 AM
Hi All,

Patch bomb incoming... This covers many subsystems, so I've included a core set
of people on the full series and additionally included maintainers on relevant
patches. I haven't included those maintainers on this cover letter since the
numbers were far too big for it to work. But I've included a link to this cover
letter on each patch, so they can hopefully find their way here. For follow up
submissions I'll break it up by subsystem, but for now thought it was important
to show the full picture.

This RFC series implements support for boot-time page size selection within the
arm64 kernel. arm64 supports 3 base page sizes (4K, 16K, 64K), but to date, page
size has been selected at compile-time, meaning the size is baked into a given
kernel image. As use of larger-than-4K page sizes becomes more prevalent this
starts to present a problem for distributions. Boot-time page size selection
enables the creation of a single kernel image, which can be told which page size
to use on the kernel command line.

Why is having an image-per-page size problematic?
=================================================

Many traditional distros are now supporting both 4K and 64K. And this means
managing 2 kernel packages, along with drivers for each. For some, it means
multiple installer flavours and multiple ISOs. All of this adds up to a
less-than-ideal level of complexity. Additionally, Android now supports 4K and
16K kernels. I'm told having to explicitly manage their KABI for each kernel is
painful, and the extra flash space required for both kernel images and the
duplicated modules has been problematic. Boot-time page size selection solves
all of this.

Additionally, in starting to think about the longer term deployment story for
D128 page tables, which the Arm architecture now supports, a lot of the same
problems need to be solved, so this work sets us up nicely for that.

So what's the down side?
========================

Well nothing's free; various static allocations in the kernel image must be
sized for the worst case (largest supported page size), so image size is in line
with the size of a 64K compile-time image. So if you're interested in 4K or 16K,
there is a slight increase to the image size. But I expect that problem goes away
if you're compressing the image - it's just some extra zeros. At boot-time, I expect
we could free the unused static storage once we know the page size - although
that would be a follow up enhancement.
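
To illustrate the worst-case sizing point, here is a minimal userspace sketch in C. It is not code from the series: MAX_PAGE_SHIFT, scratch_page and unused_bytes() are hypothetical names chosen for the example. It only shows that a statically allocated page-sized buffer has to be dimensioned for 64K, and how much of it goes unused when a smaller page size is selected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not taken from the patch series. */
#define MAX_PAGE_SHIFT 16                      /* 64K, the largest supported size */
#define MAX_PAGE_SIZE  (1UL << MAX_PAGE_SHIFT)

/*
 * A static, page-sized buffer must have its size fixed at compile time,
 * so it is dimensioned for the largest page size the image can boot with.
 */
static unsigned char scratch_page[MAX_PAGE_SIZE];

/* Bytes of scratch_page left unused once the real page shift is known. */
static size_t unused_bytes(unsigned int page_shift)
{
    return sizeof(scratch_page) - ((size_t)1 << page_shift);
}
```

With a 4K page size, unused_bytes(12) reports 61440 bytes of slack for this one buffer; with 64K it reports none. That slack is the kind of storage that could in principle be freed once the page size is known.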

And then there is performance. Since PAGE_SIZE and friends are no longer
compile-time constants, we must look up their values and do arithmetic at
runtime instead of compile-time. My early perf testing suggests this is
imperceptible for real-world workloads, and has only a small impact on
microbenchmarks - more on this below.
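
As a rough illustration of the compile-time versus runtime difference, the following userspace C sketch contrasts the two. It is an assumption-laden sketch, not the implementation in the series: rt_page_shift, rt_page_size(), rt_page_align() and select_page_size() are hypothetical names, and select_page_size() merely stands in for whatever parses the page size from the kernel command line.

```c
#include <assert.h>

/* Compile-time flavour: the shift is a literal, so the compiler can fold it. */
#define CT_PAGE_SHIFT 12
#define CT_PAGE_SIZE  (1UL << CT_PAGE_SHIFT)

/* Boot-time flavour (hypothetical names): the shift is a variable set once. */
static unsigned int rt_page_shift = 12;

/* Each use now loads the shift and computes the size at runtime. */
static inline unsigned long rt_page_size(void)
{
    return 1UL << rt_page_shift;
}

/* Round addr up to the next page boundary using the runtime page size. */
static inline unsigned long rt_page_align(unsigned long addr)
{
    unsigned long mask = rt_page_size() - 1;

    return (addr + mask) & ~mask;
}

/* Stand-in for boot-time selection; accepts only the three arm64 sizes. */
static int select_page_size(unsigned long size)
{
    switch (size) {
    case 4096:  rt_page_shift = 12; return 0;
    case 16384: rt_page_shift = 14; return 0;
    case 65536: rt_page_shift = 16; return 0;
    default:    return -1;   /* unsupported size: leave the shift unchanged */
    }
}
```

After select_page_size(16384), rt_page_align(1) yields 16384, whereas anything built on CT_PAGE_SIZE stays fixed at 4096. The extra load and shift in rt_page_size() is the sort of per-use cost that the perf testing above is measuring.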
 
[snip]

This is pretty cool. :-)  FWIW, I've built a kernel with this patch set, and
have it running in a RHEL 8.7 guest on Hyper-V in the Azure public cloud.
Ran with 4K, 16K, and 64K page sizes, and the basic smoke tests work.

That's great to hear - thanks for taking the time to test!

The Hyper-V specific code in the Linux kernel needed a few tweaks to
deal with PAGE_SIZE and friends no longer being constant, but it's nothing
significant. Getting the kernel built in the first place was a little harder
because my .config file is fairly generic with a lot of device drivers and file
system code that aren't really needed for Hyper-V guests. I had to
weed out the ones that won't build. My RHEL 8.7 install uses LVM, so I
hacked the 'dm' code to make it compile and run.

Yeah, getting all this sorted is going to be the long tail. I feel I've had
enough positive response to this RFC that I should probably just get on and
start that work to get a real feel for how much of it there is going to be.

As this work moves forward, I can supply the necessary patches for
the Hyper-V support.  Let me know if you want to include them in the
main patch set.

Great! If you are happy to forward them to me, I'll include them in future
versions of the series (or more likely, serieses).

Thanks,
Ryan

I've added a couple of Microsoft's Linux people to this email's addressee
list so they are aware of what's going on.

Michael Kelley
  