Re: [PATCH v6 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2024-06-17 21:18:19
Also in:
lkml
On Mon, 17 Jun 2024 23:01:12 +0200 Alexander Graf [off-list ref] wrote:
> > This could be an added feature, but it is very architecture specific, and would likely need architecture specific updates.
>
> It definitely would be an added feature, yes. But one that allows you to ensure persistence a lot more safely :).

Sure.

> Thinking about it again: What if you run the allocation super early (see arch/x86/boot/compressed/kaslr.c:handle_mem_options())? If you stick to allocating only from top, you're effectively kernel version independent for your allocations because none of the kernel code ran yet and definitely KASLR independent because you're running deterministically before KASLR even gets allocated.
>
> > As this code relies on memblock_phys_alloc() being consistent, if something gets allocated before it differently depending on where the kernel is, it can also move the location. A plugin to UEFI would mean that it would need to reserve the memory, and the code here will need to know where it is. We could always make the function reserve_mem() global and weak so that architectures can override it.
>
> Yes, the in-kernel UEFI loader (efi-stub) could simply populate a new type of memblock with the respective reservations and you later call memblock_find_in_range_node() instead of memblock_phys_alloc() to pass in flags that you want to allocate only from the new MEMBLOCK_RESERVE_MEM type. The same model would work for BIOS boots through the handle_mem_options() path above. In fact, if the BIOS way works fine, we don't even need UEFI variables: The same way allocations will be identical during BIOS execution, they should stay identical across UEFI launches. As cherry on top, kexec also works seamlessly with the special memblock approach because kexec (at least on x86) hands memblocks as is to the next kernel. So the new kernel will also automatically use the same ranges for its allocations.

I'm all for expanding this. But I would just want to get this in for now as is. It theoretically works on all architectures. If someone wants to make it more robust and accurate on a specific architecture, I'm all for it.

Like I said, we could make the reserve_mem() function global and weak, and then if an architecture has a better way to handle this, it could use that. Hmm, x86 could do this with the e820 code like I did in my first versions. Like I said, it didn't fail at all with that. And we can have a UEFI version as well.

-- Steve
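
For reference, the "global and weak" idea above could look roughly like the following in plain C. This is only a sketch under assumptions, not the code from the patch: reserve_mem_region(), toy_phys_alloc(), toy_top and the phys_addr_t typedef are hypothetical names made up for illustration, and the toy allocator merely stands in for memblock_phys_alloc(). In the kernel the attribute would be spelled __weak rather than written out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's physical address type. */
typedef uint64_t phys_addr_t;

/* Toy top-down allocator over a fake 4 GiB range (illustration only). */
static phys_addr_t toy_top = 0x100000000ULL;

static phys_addr_t toy_phys_alloc(phys_addr_t size, phys_addr_t align)
{
	/* Carve from the top and round the base down to the alignment. */
	phys_addr_t base = (toy_top - size) & ~(align - 1);

	toy_top = base;
	return base;
}

/*
 * Generic default. Because the symbol is weak, an architecture file
 * that defines a non-weak function with the same name and signature
 * replaces this one at link time, with no #ifdef in the generic code.
 */
__attribute__((weak))
int reserve_mem_region(const char *name, phys_addr_t size,
		       phys_addr_t align, phys_addr_t *start)
{
	(void)name;	/* the name is unused in this toy version */

	/* Reject empty or oversized requests and non power-of-two alignment. */
	if (!size || size > toy_top || !align || (align & (align - 1)))
		return -1;

	*start = toy_phys_alloc(size, align);
	return 0;
}
```

An x86 or UEFI variant would then be a second, non-weak definition of the same function in an arch file (for example one that consults the e820 table or firmware-provided reservations), and callers would not need to change.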