Re: [PATCH] net: randomize layout of struct net_device
From: Eric Dumazet <edumazet@google.com>
Date: 2025-06-06 15:43:08
Also in: linux-kernel-mentees, lkml
On Fri, Jun 6, 2025 at 7:55 AM Pranav Tyagi [off-list ref] wrote:
> On Tue, Jun 3, 2025 at 12:36 AM Andrew Lunn [off-list ref] wrote:
> > On Mon, Jun 02, 2025 at 11:03:18AM -0700, Kees Cook wrote:
> > > On Mon, Jun 02, 2025 at 04:46:14PM +0200, Andrew Lunn wrote:
> > > > On Mon, Jun 02, 2025 at 07:29:32PM +0530, Pranav Tyagi wrote:
> > > > > Add __randomize_layout to struct net_device to support structure layout randomization if CONFIG_RANDSTRUCT is enabled; otherwise the macro expands to nothing. This enhances kernel protection by making it harder to predict the memory layout of this structure.
> > > > >
> > > > > Link: https://github.com/KSPP/linux/issues/188
> > >
> > > I would note that the TODO item in this Issue is "evaluate struct net_device".
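As a rough illustration of the macro behaviour described in the patch description, the sketch below uses a hypothetical switch `TOY_RANDSTRUCT` in place of CONFIG_RANDSTRUCT and a hypothetical macro `toy_randomize_layout` in place of the kernel's `__randomize_layout`. The structure and helper are invented for the example and are not the real struct net_device. It assumes the compiler accepts `__attribute__((randomize_layout))` when the switch is defined; without the switch the macro expands to nothing and the code builds anywhere.

```c
#include <stddef.h>

/*
 * TOY_RANDSTRUCT is a hypothetical stand-in for CONFIG_RANDSTRUCT.
 * The attribute is only meaningful to compilers with layout
 * randomization support, so it is left off unless the switch is set.
 */
#ifdef TOY_RANDSTRUCT
#define toy_randomize_layout __attribute__((randomize_layout))
#else
#define toy_randomize_layout /* expands to nothing */
#endif

/* Hypothetical structure for illustration only. */
struct toy_netdev {
    char name[16];
    unsigned int mtu;
    const void *ops;
} toy_randomize_layout;

/* Access by member name does not depend on where the member ends up. */
static inline unsigned int toy_get_mtu(const struct toy_netdev *dev)
{
    return dev->mtu;
}
```

Code that names members, such as `toy_get_mtu()` or a designated initializer like `{ .mtu = 1500 }`, does not depend on member order. Positional initializers and hard-coded offsets do.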
> > > > A dumb question i hope.
> > > >
> > > > As you can see from this comment, some time and effort has been put into the order of members in this structure so that those which are accessed on the TX fast path are in the same cache line, and those on the RX fast path are in the same cache line, and RX and TX fast paths are in different cache lines, etc.
> > >
> > > This is pretty well exactly one of the right questions to ask, and should be detailed in the commit message. Mainly:
> > >
> > > a) how do we know it will not break anything?
> > > b) why is net_device a struct that is likely to be targeted by an attacker?
> >
> > For a), i doubt anything will break. The fact the structure has been optimised for performance implies that members have been moved around, and there are no comments in the structure saying don't move this, otherwise bad things will happen. There is a:
> >
> >     u8 priv[] ____cacheline_aligned __counted_by(priv_len);
> >
> > at the end, but i assume RANDSTRUCT knows about them and won't move it.
> >
> > As for b), i've no idea, not my area. There are a number of pointers to structures containing ops. Maybe if you can take over those pointers, point to something you can control, you can take control of the Program Counter?
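The `priv[]` member mentioned above is a flexible array member, which C requires to be the last member of a structure. A minimal user-space sketch of the pattern follows. The names `toy_privdev` and `toy_alloc_dev` are hypothetical, and the kernel's `____cacheline_aligned` and `__counted_by` annotations are deliberately left out. How a randomizing compiler treats such a member is not shown here and remains the assumption stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical structure for illustration; not the real struct net_device. */
struct toy_privdev {
    unsigned int mtu;
    size_t priv_len;        /* number of bytes available in priv[] */
    unsigned char priv[];   /* flexible array member: C requires it to be last */
};

/* The private area starts after the length field that describes it. */
static_assert(offsetof(struct toy_privdev, priv) >=
              offsetof(struct toy_privdev, priv_len) + sizeof(size_t),
              "priv[] must follow priv_len");

/* Allocate the fixed part plus priv_len bytes of zeroed private data. */
static struct toy_privdev *toy_alloc_dev(size_t priv_len)
{
    struct toy_privdev *dev = calloc(1, sizeof(*dev) + priv_len);

    if (!dev)
        return NULL;
    dev->priv_len = priv_len;
    return dev;
}
```

The caller releases the whole object, fixed part and private area together, with a single `free()`.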
> > > > Does CONFIG_RANDSTRUCT understand this? It is safe to move members around within a cache line. And it is safe to move whole cache lines around. But it would be bad if the randomisation moved members between cache lines, mixing up RX and TX fast path members, or spreading fast path members over more cache lines, etc.
> > >
> > > No, it'll move stuff all around. It's very much a security vs performance trade-off, but the systems being built with it are happy to take the hit.
> >
> > It would be interesting to look back at the work optimising this structure to get a ballpark figure for how big a hit this is. I also think some benchmark numbers would be interesting. I would consider two different systems:
> >
> > 1) A small ARM/MIPS/RISC-V with 1G interfaces. The low amount of L1 cache on these systems means that cache misses are important. So spreading out the fast path members will be bad.
> >
> > 2) Desktop/Server class hardware, lots of cores, lots of cache, 10G, 40G or 100G interfaces. For these systems, i expect cache line bouncing is more of an issue, so Rx and Tx fast path members want to be kept in separate cache lines.
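The grouping concern raised above can be written down as compile-time checks. The sketch below assumes 64-byte cache lines and uses a hypothetical `toy_dev` structure with invented member names. It is not the real struct net_device and does not apply any randomization. It only states the property that free shuffling of members would no longer guarantee.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHELINE 64 /* assumption: 64-byte cache lines */

/* Hypothetical structure for illustration; not the real struct net_device. */
struct toy_dev {
    /* TX fast-path members, grouped at the start of one cache line */
    _Alignas(TOY_CACHELINE) unsigned long tx_state;
    unsigned int tx_queue_len;
    const void *tx_ops;

    /* RX fast-path members, forced onto a different cache line */
    _Alignas(TOY_CACHELINE) unsigned long rx_state;
    unsigned int rx_mtu;
    const void *rx_handler;
};

/* Index of the cache line that holds a given byte offset. */
static inline size_t toy_line_of(size_t offset)
{
    return offset / TOY_CACHELINE;
}

/* Nonzero when two offsets fall on the same cache line. */
static inline int toy_same_line(size_t a, size_t b)
{
    return toy_line_of(a) == toy_line_of(b);
}

/* Compile-time statement of the intended grouping. */
static_assert(offsetof(struct toy_dev, tx_ops) / TOY_CACHELINE ==
              offsetof(struct toy_dev, tx_state) / TOY_CACHELINE,
              "TX members should share a cache line");
static_assert(offsetof(struct toy_dev, rx_state) / TOY_CACHELINE !=
              offsetof(struct toy_dev, tx_state) / TOY_CACHELINE,
              "RX and TX members should sit on different cache lines");
```

Reordering members inside either group keeps both assertions true, which matches the observation that moving members within a cache line is safe. Moving an RX member into the TX group would not be caught by these two checks alone, so a real check would need one assertion per member.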
> > > The basic details are in security/Kconfig.hardening in the "choice" following the CC_HAS_RANDSTRUCT entry.
> >
> > So i see two settings here. It looks like RANDSTRUCT_PERFORMANCE should have minimal performance impact, so maybe this should be mentioned in the commit message, and the benchmarks performed both on full randomisation and with the performance setting.
> >
> > I would also suggest a comment is added to the top of Documentation/networking/net_cachelines/net_device.rst pointing out that it assumes RANDSTRUCT is disabled, and the existing comment in struct net_device is also updated.
> >
> > Andrew
>
> Resending to the list. My previous reply was accidentally sent off-list. Apologies for the delayed response, and thank you all for the detailed feedback.
>
> Regarding the concern about breaking functionality, I did compile and boot the kernel successfully with this change, and everything appeared to work as expected during basic testing. However, I agree that this is not a substitute for thorough benchmarking.
>
> You're absolutely right that applying __randomize_layout to net_device will shuffle structure fields and likely incur a performance penalty. As mentioned, this is a trade-off that targets hardening over performance. It's worth noting that CONFIG_RANDSTRUCT has two options: RANDSTRUCT_FULL and RANDSTRUCT_PERFORMANCE, with the latter aiming to minimize the impact by only shuffling less performance-critical members.
>
> I’d appreciate guidance on which specific benchmarking tests would be most appropriate to quantify the performance impact. Based on your suggestions, I plan to run benchmarks on a small SoC (ARM/MIPS/RISC-V) with 1G NICs. However, I currently don’t have access to high-end server hardware with 10G/40G+ NICs, so I’ll start with the systems I have and clearly note the limitations in the revised commit message.
>
> I’ll also update the commit message to reflect the security vs performance trade-offs, mention RANDSTRUCT_PERFORMANCE, and add a reference to net_cachelines/net_device.rst to document the assumption of structure layout.
>
> Thanks again for the thoughtful review. I’ll revise the patch accordingly.
Do you have evidence of added security on this particular structure? What particular bug could have been avoided with __randomize_layout?

Most distros use CONFIG_RANDSTRUCT_NONE=y; I do not think __randomize_layout has a future.