Thread (40 messages) 40 messages, 12 authors, 2024-10-02

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

From: Lorenzo Stoakes <hidden>
Date: 2024-09-06 09:54:44
Also in: linux-alpha, linux-arch, linux-kselftest, linux-mips, linux-mm, linux-s390, linux-sh, lkml, loongarch, sparclinux

(Sorry having issues with my IPv6 setup that duplicated the original email...

On Fri, Sep 06, 2024 at 09:14:08AM GMT, Arnd Bergmann wrote:
On Fri, Sep 6, 2024, at 08:14, Lorenzo Stoakes wrote:
quoted
On Fri, Sep 06, 2024 at 07:17:44AM GMT, Arnd Bergmann wrote:
quoted
On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
quoted
Create a personality flag ADDR_LIMIT_47BIT to support applications
that wish to transition from running in environments that support at
most 47-bit VAs to environments that support larger VAs. This
personality can be set to cause all allocations to be below the 47-bit
boundary. Using MAP_FIXED with mmap() will bypass this restriction.

Signed-off-by: Charlie Jenkins <redacted>
I think having an architecture-independent mechanism to limit the size
of the 64-bit address space is useful in general, and we've discussed
the same thing for arm64 in the past, though we have not actually
reached an agreement on the ABI previously.
The thread on the original proposals attests to this being rather a fraught
topic, and I think the weight of opinion was more so in favour of opt-in
rather than opt-out.
You mean opt-in to using the larger addresses like we do on arm64 and
powerpc, while "opt-out" means a limit as Charlie suggested?
I guess I'm not using brilliant terminology here haha!

To clarify - the weight of opinion was for a situation where the address
space is limited, except if you set a hint above that (you could call that
opt-out or opt-in depending which way you look at it, so yeah ok very
unclear sorry!).

It was against the MAP_ flag and also I think a _flexible_ per-process
limit is also questionable as you might end up setting a limit which breaks
something else, and this starts getting messy quick.

To be clear, the ADDR_LIMIT_47BIT suggestion is absolutely a compromise and
practical suggestion.
quoted
quoted
quoted
@@ -22,6 +22,7 @@ enum {
 	WHOLE_SECONDS =		0x2000000,
 	STICKY_TIMEOUTS	=	0x4000000,
 	ADDR_LIMIT_3GB = 	0x8000000,
+	ADDR_LIMIT_47BIT = 	0x10000000,
};
I'm a bit worried about having this done specifically in the
personality flag bits, as they are rather limited. We obviously
don't want to add many more such flags when there could be
a way to just set the default limit.
Since I'm the one who suggested it, I feel I should offer some kind of
vague defence here :)

We shouldn't let perfect be the enemy of the good. This is a relatively
straightforward means of achieving the aim (assuming your concern about
arch_get_mmap_end() below isn't a blocker) which has the least impact on
existing code.

Of course we can end up in absurdities where we start doing
ADDR_LIMIT_xxBIT... but again - it's simple, shouldn't represent an
egregious maintenance burden and is entirely opt-in so has things going for
it.
I'm more confused now, I think most importantly we should try to
handle this consistently across all architectures. The proposed
implementation seems to completely block addresses above BIT(47)
even for applications that opt in by calling mmap(BIT(47), ...),
which seems to break the existing applications.
Hm, I thought the commit message suggested the hint overrides it still?

The intent is to optionally be able to run a process that keeps higher bits
free for tagging and to be sure no memory mapping in the process will
clobber these (correct me if I'm wrong Charlie! :)

So you really wouldn't want this if you are using tagged pointers, you'd
want to be sure literally nothing touches the higher bits.
If we want this flag for RISC-V and also keep the behavior of
defaulting to >BIT(47) addresses for mmap(0, ...) how about
changing arch_get_mmap_end() to return the limit based on
ADDR_LIMIT_47BIT and then make this default to enabled on
arm64 and powerpc but disabled on riscv?
But you wouldn't necessarily want all processes to be so restricted, I
think this is what Charlie's trying to avoid :)

On the ohter hand - I'm not sure there are many processes on any arch
that'd want the higher mappings.

So that'd push us again towards risc v just limiting to 48-bits and only
mapping above this if a hint is provided like x86-64 does (and as you
mentioned via irc - it seems risc v is an outlier in that
DEFAULT_MAP_WINDOW == TASK_SIZE).

This would be more consistent vs. other arches.
quoted
quoted
It's also unclear to me how we want this flag to interact with
the existing logic in arch_get_mmap_end(), which attempts to
limit the default mapping to a 47-bit address space already.
How does ADDR_LIMIT_3GB presently interact with that?
That is x86 specific and only relevant to compat tasks, limiting
them to 3 instead of 4 GB. There is also ADDR_LIMIT_32BIT, which
on arm32 is always set in practice to allow 32-bit addressing
as opposed to ARMv2 style 26-bit addressing (IIRC ARMv3 supported
both 26-bit and 32-bit addressing, while ARMv4 through ARMv7 are
32-bit only.
OK, I understand what it's for, I missed it was arch-specific bit, urgh.

I'd say this limit should be min of the arch-specific limit vs. the 48-bit
limit. If you have a 36-bit address space obviously it'd be rather unwise
to try to provide 48 bit addresses..
      Arnd
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help