Re: [PATCH v3 3/3] randomize_kstack: Unify random source across arches
From: David Laight <hidden>
Date: 2026-01-12 13:36:24
Also in:
linux-arm-kernel, linux-hardening, linux-riscv, linux-s390, lkml, loongarch
On Mon, 12 Jan 2026 12:26:26 +0000 Ryan Roberts [off-list ref] wrote:
On 07/01/2026 14:05, David Laight wrote:
On Sun, 4 Jan 2026 23:01:36 +0000 David Laight [off-list ref] wrote:
...
I've trimmed the initialiser - it is very boring.
The code to create the initialiser is actually slightly smaller than it is.
Doable by hand provided you can do 128bit shift and xor without making any mistakes.

I've just done a quick search through the kernel sources and haven't found many uses of prandom_u32_state() outside of test code.
There is sched_rng() which uses a per-cpu rng to throw a 1024 sized die.
bpf also has a per-cpu one for 'unprivileged user space'.
net/sched/sch_netem.c seems to use one - mostly for packet loss generation.

Since the randomize_kstack code is now using a per-task rng (initialised by clone?) that could be used instead of all the others provided they are run when 'current' is valid.

But the existing prandom_u32_state() needs a big health warning that four outputs leak the entire state.
That is fixable by changing the last line to:
	return state->s1 + state->s2 + state->s3 + state->s4;
That only affects the output value, the period is unchanged.

Hi David,

This all seems interesting, but I'm not clear that it is a blocker for this series. As I keep saying, we only use 6 bits for offset randomization so it is trivial to brute force, regardless of how easy it is to recover the prng state.

Perhaps we can decouple these 2 things and make them independent:

 - this series, which is motivated by speeding up syscalls on arm64; given 6 bits is not hard to brute force, spending a lot of cycles calculating those bits is unjustified.

 - Your observation that the current prng could be improved to make recovering its state harder.

What do you think?
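For readers following the thread, the proposed change to the last line can be sketched in userspace C. This is a minimal sketch, not the kernel source: it assumes the generator is the four-word Tausworthe step used by prandom_u32_state() in lib/random32.c (constants quoted from memory, so check them against the tree), and the names taus_state, taus_component, taus_step, taus_next_xor and taus_next_sum are hypothetical. The likely reasoning behind the "four outputs leak the entire state" remark is that shifts, masks and XORs are all linear over GF(2), so four 32-bit XOR outputs give 128 linear equations in the 128 state bits; the carries in an addition break that linearity. The sketch only shows that both variants advance the state identically and differ in the value returned.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct rnd_state. */
struct taus_state {
	uint32_t s1, s2, s3, s4;
};

/* One Tausworthe component update: shifts, masks and XORs only. */
uint32_t taus_component(uint32_t s, unsigned int a, unsigned int b,
			uint32_t c, unsigned int d)
{
	return ((s & c) << d) ^ (((s << a) ^ s) >> b);
}

/* Advance all four words; this alone determines the period. */
void taus_step(struct taus_state *st)
{
	st->s1 = taus_component(st->s1,  6, 13, 4294967294U, 18);
	st->s2 = taus_component(st->s2,  2, 27, 4294967288U,  2);
	st->s3 = taus_component(st->s3, 13, 21, 4294967280U,  7);
	st->s4 = taus_component(st->s4,  3, 12, 4294967168U,  3);
}

/* Existing behaviour as assumed here: output is the XOR of the words. */
uint32_t taus_next_xor(struct taus_state *st)
{
	taus_step(st);
	return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}

/* Variant from the mail: same state update, output is the 32-bit sum. */
uint32_t taus_next_sum(struct taus_state *st)
{
	taus_step(st);
	return st->s1 + st->s2 + st->s3 + st->s4;
}
```

Because taus_step() is shared, any seed produces the same sequence of internal states under either output function.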
They are separate.
I should have a 'mostly written' patch series for prandom_u32_state().

If you unconditionally add a per-task prng there are a few places that could use it instead of a per-cpu one.
It could be 'perturbed' during task switch - eg by:
	s->s1 = (s->s1 ^ something) | 2;
(The 2 stops the new value being 0 or 1, losing 1 bit won't be significant.)

This one is much nearer 'ready' and has an obvious impact.

	David
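The perturbation step in the reply above can be sketched as follows. This is an illustration under stated assumptions: taus_s1_step and taus_perturb_s1 are hypothetical names, the step constants are those of the first Tausworthe word as recalled from lib/random32.c, and the thread does not say what 'something' would be at task switch. The first word masks off its low bit on every step, so a value of 0 or 1 steps to 0 and stays there; OR-ing in 2 rules both values out at the cost of one bit.

```c
#include <stdint.h>

/* Assumed update for the first Tausworthe word (mask drops bit 0). */
uint32_t taus_s1_step(uint32_t s)
{
	return ((s & 0xfffffffeU) << 18) ^ (((s << 6) ^ s) >> 13);
}

/*
 * Mix an external value into s1. 'something' is a placeholder for
 * whatever cheap per-switch value is available; forcing bit 1 on
 * guarantees the result is never 0 or 1.
 */
uint32_t taus_perturb_s1(uint32_t s1, uint32_t something)
{
	return (s1 ^ something) | 2;
}
```

Even in the worst case, where 'something' equals the current s1 and the XOR cancels to zero, the result is 2, which the step function maps to a non-zero value.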
Thanks,
Ryan
David