Thread: 26 messages, 5 authors, 2026-01-19

Re: [PATCH v3 0/3] Fix bugs and performance of kstack offset randomisation

From: David Laight <hidden>
Date: 2026-01-19 12:22:55
Also in: linux-arm-kernel, linux-hardening, linux-riscv, linux-s390, lkml, loongarch

On Mon, 19 Jan 2026 10:52:59 +0000
Mark Rutland [off-list ref] wrote:

> On Fri, Jan 02, 2026 at 01:11:51PM +0000, Ryan Roberts wrote:
> > Hi All,
>
> Hi Ryan,
>
> > As I reported at [1], kstack offset randomisation suffers from a couple of bugs
> > and, on arm64 at least, the performance is poor. This series attempts to fix
> > both; patch 1 provides back-portable fixes for the functional bugs. Patches 2-3
> > propose a performance improvement approach.
> >
> > I've looked at a few different options but ultimately decided that Jeremy's
> > original prng approach is the fastest. I made the argument that this approach is
> > secure "enough" in the RFC [2] and the responses indicated agreement.
>
> FWIW, the series all looks good to me. I understand you're likely to
> spin a v4 with a couple of minor tweaks (fixing typos and adding an
> out-of-line wrapper for a prandom function), but I don't think there's
> anything material that needs to change.
>
> I've given my Ack on all three patches. I've given the series a quick
> boot test (atop v6.19-rc4) with a bunch of debug options enabled, and
> all looks well.
>
> Kees, do you have any comments? It would be nice if we could queue this
> up soon.

I don't want to stop this being queued up in its current form.
But I don't see an obvious need for multiple per-cpu prngs
(there are a couple of others lurking); surely one will do.

How much overhead does the get_cpu_var() add?
I think it has to disable pre-emption (or interrupts), which might
be more expensive on architectures other than x86 (x86 can just do
'inc %gs:address').
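
To make the question concrete, here is a rough userspace model of the two
access patterns, not kernel code. In the kernel, get_cpu_var() disables
pre-emption before taking the per-cpu pointer and put_cpu_var() re-enables
it. Every model_* name, the array of "cpus" and the xorshift32 step are
hypothetical stand-ins invented for this sketch.

```c
#include <stdint.h>

/* Userspace model only: every name below is a hypothetical stand-in. */
#define MODEL_NR_CPUS 4

static int model_preempt_count;	/* stands in for the pre-emption count */
static int model_max_depth;	/* deepest nesting seen, for inspection */
static int model_cpu;		/* the "cpu" we are currently running on */
static uint32_t model_state[MODEL_NR_CPUS] = { 1, 2, 3, 4 };

static void model_preempt_disable(void)
{
	model_preempt_count++;
	if (model_preempt_count > model_max_depth)
		model_max_depth = model_preempt_count;
}

static void model_preempt_enable(void)
{
	model_preempt_count--;
}

/* One xorshift32 step; an assumption, not the prng used by the series. */
static uint32_t model_prng_step(uint32_t s)
{
	s ^= s << 13;
	s ^= s >> 17;
	s ^= s << 5;
	return s;
}

/* Pattern 1: pin to this cpu around the access (get/put_cpu_var() style). */
static uint32_t next_pinned(void)
{
	uint32_t *s;
	uint32_t v;

	model_preempt_disable();	/* extra work before the access */
	s = &model_state[model_cpu];
	v = *s = model_prng_step(*s);
	model_preempt_enable();		/* extra work after the access */
	return v;
}

/* Pattern 2: no pinning; the state used may belong to the "wrong" cpu. */
static uint32_t next_unpinned(void)
{
	uint32_t *s = &model_state[model_cpu];

	return *s = model_prng_step(*s);
}
```

The only difference between the two helpers is the disable/enable pair,
which is the overhead being asked about; what that pair really costs
depends on the architecture and is not something this model can show.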

I'm sure I remember a version that used a per-task prng.
That just needs 'current' - which might be known and/or be cheaper
to get.
(Although I also remember a reference to some system where it was slow...)

The other option is just to play 'fast and loose' with the prng data.
Using the state from the 'wrong cpu' (if the code is pre-empted) won't
really matter.
You might get a RrwW (or even RrwrwW) sequence, but the prng won't be used
for anything 'really important' so it shouldn't matter.
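
A small userspace replay of the RrwW case, as a sketch only. It reads the
notation as upper case for one context and lower case for the other (Read,
read, write, Write), which is an interpretation. The xorshift32 step and
all names are hypothetical and are not taken from the series.

```c
#include <stdint.h>

/* Hypothetical stand-in for the prng: one xorshift32 step. */
static uint32_t prng_step(uint32_t s)
{
	s ^= s << 13;
	s ^= s >> 17;
	s ^= s << 5;
	return s;
}

/* Shared prng state, deliberately accessed without locking or pinning. */
static uint32_t shared_state;

/* What one context holds between its read and its write. */
struct ctx {
	uint32_t loaded;	/* value read from shared_state */
	uint32_t next;		/* new state computed from it */
};

static void ctx_read(struct ctx *c)
{
	c->loaded = shared_state;
	c->next = prng_step(c->loaded);
}

static void ctx_write(const struct ctx *c)
{
	shared_state = c->next;
}

/*
 * Replay "RrwW": A reads, B reads, B writes, A writes.
 * Returns the final shared state; the value each context would use is
 * handed back through out_a and out_b.
 */
static uint32_t replay_RrwW(uint32_t seed, uint32_t *out_a, uint32_t *out_b)
{
	struct ctx a, b;

	shared_state = seed;
	ctx_read(&a);	/* R */
	ctx_read(&b);	/* r */
	ctx_write(&b);	/* w */
	ctx_write(&a);	/* W */

	*out_a = a.next;
	*out_b = b.next;
	return shared_state;
}
```

In this replay both contexts end up with the same value and the state only
advances by one step, so an update is lost, but the state is still a valid
point in the sequence. This models plain word-sized loads and stores only;
it says nothing about a prng whose state spans several words.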

	David