Thread: 29 messages, 10 authors, 2020-03-11

Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU

From: Arnd Bergmann <arnd@arndb.de>
Date: 2020-03-09 19:46:42
Also in: linux-fsdevel, linux-mm, lkml

On Mon, Mar 9, 2020 at 5:09 PM Russell King - ARM Linux admin
[off-list ref] wrote:
> On Mon, Mar 09, 2020 at 03:59:45PM +0000, Catalin Marinas wrote:
> > On Sun, Mar 08, 2020 at 11:58:52AM +0100, Arnd Bergmann wrote:
> > > - revisit CONFIG_VMSPLIT_4G_4G for arm32 (and maybe mips32)
> > >   to see if it can be done, and what the overhead is. This is probably
> > >   more work than the others combined, but also the most promising
> > >   as it allows the most user address space and physical ram to be used.
> >
> > A rough outline of such support (and likely to miss some corner cases):
> >
> > 1. Kernel runs with its own ASID and non-global page tables.
> >
> > 2. Trampoline code on exception entry/exit to handle the TTBR0 switching
> >    between user and kernel.
> >
> > 3. uaccess routines need to be reworked to pin the user pages in memory
> >    (get_user_pages()) and access them via the kernel address space.
> >
> > Point 3 is probably the ugliest and it would introduce a noticeable
> > slowdown in certain syscalls.

There are probably a number of ways to do the basic design. The idea
I had (again, probably missing more corner cases than either of you
two, who actually understand the details of the MMU):

- Assuming we have LPAE, run the kernel vmlinux and modules inside
  the vmalloc space, in the top 256MB or 512MB on TTBR1

- Map all the physical RAM (up to 3.75GB) into a reserved ASID
  with TTBR0

- Flip TTBR0 on kernel entry/exit, and again during user access.

This is probably more work to implement than your idea, but
I would hope this has a lower overhead on most microarchitectures
as it doesn't require pinning the pages. Depending on the
microarchitecture, I'd hope the overhead would be comparable
to that of ARM64_SW_TTBR0_PAN.
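
For illustration only, the 3.75GB figure above follows from simple
address-space arithmetic. The sketch below is not kernel code, and the
macro and function names are made up for this example; it only assumes
a 32-bit virtual address space whose top window belongs to TTBR1.

```c
#include <stdint.h>

/* Size of a 32-bit virtual address space, in bytes. */
#define SKETCH_VA_SPACE (1ULL << 32)

/*
 * Hypothetical helper (not a kernel function): bytes of virtual address
 * space left for a linear map of RAM on TTBR0 once a window of
 * kernel_window_mb megabytes at the top is handed to TTBR1.
 */
static uint64_t sketch_linear_map_bytes(uint64_t kernel_window_mb)
{
	return SKETCH_VA_SPACE - (kernel_window_mb << 20);
}
```

With a 256MB kernel window this comes to 3840MB (3.75GB); with a 512MB
window it drops to 3584MB (3.5GB).
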
> We also need to consider that it has implications for the single-kernel
> support; a kernel doing this kind of switching would likely be horrid
> for a kernel supporting v6+ with VIPT aliasing caches.  Would we be
> adding a new red line between kernels supporting VIPT-aliasing caches
> (present in earlier v6 implementations) and kernels using this system?

I would initially do it for LPAE only, given that this is already an
incompatible config option. I don't think there are any v6 machines with
more than 1GB of RAM (the maximum for AST2500), and the only distro
that ships a v6+ multiplatform kernel is Raspbian, which in turn needs
a separate LPAE kernel for the large-memory machines anyway.

Only doing it for LPAE would still cover the vast majority of systems that
actually shipped with more than 2GB. There are a couple of exceptions,
i.e. early Cubox i4x4, the Calxeda Highbank developer system and the
Novena Laptop, which I would guess have a limited life expectancy
(before users stop updating kernels) no longer than the 8GB
Keystone-2.

Based on that, I would hope that the ARMv7 distros can keep shipping
the two kernel images they already ship:

- The non-LPAE kernel modified to VMSPLIT_2G_OPT, not using highmem
  on anything up to 2GB, but still supporting the handful of remaining
  Cortex-A9s with 4GB using highmem until they are completely obsolete.

- The LPAE kernel modified to use a newly added VMSPLIT_4G_4G,
  with details to be worked out.

Most new systems tend to be based on Cortex-A7 with no more than 2GB,
so those could run either configuration well.  If we find the 2GB of user
address space too limiting for the non-LPAE config, or I missed some
important pre-LPAE systems with 4GB that need to be supported for longer
than other highmem systems, that can probably be added later.
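
As a rough sketch of the trade-off in the non-LPAE case: lowering
PAGE_OFFSET gives the kernel more lowmem and takes the same amount away
from user space. The code below is illustrative only. It assumes the
default arm32 vmalloc area starts at 0xf0000000, it ignores the module
area below PAGE_OFFSET, and the 0x70000000 value used for a
VMSPLIT_2G_OPT-style split is a guess rather than an existing arm
option. All names are hypothetical.

```c
#include <stdint.h>

/* Assumed start of the vmalloc area with the default vmalloc size. */
#define SKETCH_VMALLOC_MIN 0xf0000000u

/*
 * Hypothetical helper: bytes of RAM that fit in the linear (lowmem)
 * mapping for a given PAGE_OFFSET, under the assumption above.
 */
static uint32_t sketch_lowmem_bytes(uint32_t page_offset)
{
	return SKETCH_VMALLOC_MIN - page_offset;
}

/* User address space is roughly everything below PAGE_OFFSET. */
static uint32_t sketch_user_bytes(uint32_t page_offset)
{
	return page_offset;
}
```

Under these assumptions, PAGE_OFFSET=0x80000000 (VMSPLIT_2G) leaves
1792MB of lowmem, so a 2GB machine still needs highmem, while
0x70000000 would map the full 2GB linearly at the cost of 256MB of
user address space.
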

    Arnd

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel