Thread (30 messages) 30 messages, 11 authors, 2024-10-24

Re: [PATCH RFC v2 0/4] mm: Introduce MAP_BELOW_HINT

From: Charlie Jenkins <hidden>
Date: 2024-09-05 17:26:58
Also in: linux-alpha, linux-arch, linux-kselftest, linux-mips, linux-mm, linux-s390, linux-sh, lkml, loongarch, sparclinux

On Thu, Sep 05, 2024 at 09:47:47AM +0300, Kirill A. Shutemov wrote:
On Thu, Aug 29, 2024 at 12:15:57AM -0700, Charlie Jenkins wrote:
quoted
Some applications rely on placing data in free bits addresses allocated
by mmap. Various architectures (eg. x86, arm64, powerpc) restrict the
address returned by mmap to be less than the 48-bit address space,
unless the hint address uses more than 47 bits (the 48th bit is reserved
for the kernel address space).

The riscv architecture needs a way to similarly restrict the virtual
address space. On the riscv port of OpenJDK an error is thrown if
attempted to run on the 57-bit address space, called sv57 [1].  golang
has a comment that sv57 support is not complete, but there are some
workarounds to get it to mostly work [2].

These applications work on x86 because x86 does an implicit 47-bit
restriction of mmap() address that contain a hint address that is less
than 48 bits.

Instead of implicitly restricting the address space on riscv (or any
current/future architecture), a flag would allow users to opt-in to this
behavior rather than opt-out as is done on other architectures. This is
desirable because it is a small class of applications that do pointer
masking.
This argument looks broken to me.

The "small class of applications" is going to be broken unless they got
patched to use your new mmap() flag. You are asking for bugs.

Consider the case when you write, compile and validate a piece of software
on machine that has <=47bit VA. The binary got shipped to customers.
Later, customer gets a new shiny machine that supports larger address
space and your previously working software is broken. Such binaries might
exist today.

It is bad idea to use >47bit VA by default. Most of software got tested on
x86 with 47bit VA.

We can consider more options to opt-in into wider address space like
personality or prctl() handle. But opt-out is no-go from what I see.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
riscv is in an interesting state in regards to this because the software
ecosystem is much less mature than other architectures. The existing
riscv hardware supports either 38 or 47 bit userspace VAs, but a lot of
people test on QEMU which defaults to 56 bit. As a result, a lot of
code is tested with the larger address space. Applications that don't
work on the larger address space, like OpenJDK, currently throw an error
and exit.

Since riscv does not currently have the address space default to 47
bits, some applications just don't work on 56 bits. We could change the
kernel so that these applications start working without the need for
them to change their code, but that seems like the kernel is
overstepping and fixing binaries rather than providing users tools to
fix the binaries themselves.

This mmap flag was an attempt to provide a tool for these applications
that work on the existing 47 bit VA hardware to also work on different
hardware that supports a 56 bit VA space. After feedback, it looks like
a better solution than the mmap flag is to use the personality syscall
to set a process wide restriction to 47 bits instead, which matches the
32 bit flag that already exists.

- Charlie

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help