Re: [PATCH 1/6 v14] ARM: Handle a device tree in lowmem
From: Ard Biesheuvel <ardb@kernel.org>
Date: 2020-10-05 09:15:38
On Mon, 5 Oct 2020 at 09:14, Ard Biesheuvel [off-list ref] wrote:
Hi Linus, Thanks for the elaborate explanation. On Sun, 4 Oct 2020 at 22:50, Linus Walleij [off-list ref] wrote:quoted
On Fri, Oct 2, 2020 at 1:01 PM Ard Biesheuvel [off-list ref] wrote:quoted
On Thu, 1 Oct 2020 at 17:22, Linus Walleij [off-list ref] wrote:quoted
OK, so if I am understanding this correctly, the root problem is that the kernel unmaps the memory that the attached DTB resides in, right?Yups, or, well in some places the kernel knows that the DT is there so it sets up two 1MB sections over it, and then it assumes "no-one is ever gonna touch those two section mappings, OK".OK, so from your explanation, I gathered that: - the 'appended DTB' is not appended anymore when the core kernel boots, and the problem is caused by the fact that the decompressor blindly chooses a location for it, whereas the firmware makes a more informed decision when it places the DT in memory - the memory holding the DT may be unmapped inadvertently - the memory holding the DT may be wiped inadvertently. ...quoted
The kernel actually fits in the first memblock, but then it clears the lowmem right below itself and by that point the MMU has figured out that there is another memblock below it that it will happily use for lowmem and just goes ahead and wipes that. The code has no awareness that there might be a DTB there.Where in the code does this happen? It seems to me that at this point, the DT memory should have been memblock_reserve()d, and the code in question should disregard it from wiping.quoted
The decompressor seems to have always been blissfully ignorant about what happens with the attached DTB after it just pushed it a bit upwards in memory (if relocating) it just passes the location in r2 in accordance with the boot specification, to me it seems more like something the kernel proper should handle. I don't think that loading the DTB separately to some high address as advocated by many peope is much better. It can create the same problem if loaded in the wrong place and possibly be placed in other dangerous areas like inside the VMALLOC area where it can get its mappings destroyed at any instance. (We don't check for that either.)I think putting the mapping of the DT outside of the linear region is reasonable, tbh, and this is what we do on arm64. That way, the firmware does not have to care at all about the MM configuration of the kernel, and it could simply put it in high memory as well. One option would be to create a virtual mapping for the DT at the base of the modules region. This takes up 1 MB in the typical (non-LPAE) case, and 4 MB in the worst case, making it more likely that you will run out of module space, but we already have infrastructure to deal with that. ...quoted
So to summarize: - The kernel decompressor just moves the kernel and the attached DTB upward in memory so it will not be overwritten by the decompressor. - This will sometimes make part of the compressed kernel and DTB end up above the first memblock. - This works because there is more memory above the first memblock, in an adjacent memblock. (The decompressor has always assumed so much) - The kernel then puts the lowmem mappings from the end of the first memblock to VMALLOC_START and clear the PMDs - If the DTB has been pushed up to lowmem, the two PMDs over the DTB will be cleared, resulting in a crash. Maybe I should just put all of this into the commit message so people can see the mess :DI'd prefer to fix this in a more structural way, tbh. Let me see if I can code up a PoC
I pushed a branch to https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-dt-mapping that moves the DT mapping to a read-only region at the top of the kernel VA space: there happened to be a 4 MB hole there (between VMALLOC_END and FIXADDR_START) that we can use, even if the purpose of that hole was as a guard region, as a read-only mapping still catches stray writes. What I don't get is why the DT *contents* get clobbered - arm_memblock_init() memblock_reserve's the DT contents, and wiping reserved memblocks is something we really shouldn't be doing. -- Ard. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel