Re: PPC476 hangs during tlb flush after calling /init in crash kernel with linux 5.4+
From: Christophe Leroy <hidden>
Date: 2021-04-28 06:08:26
Also in: lkml
On 28/04/2021 at 00:42, Eddie James wrote:
> On Tue, 2021-04-27 at 19:26 +0200, Christophe Leroy wrote:
>> Hi Eddie,
>>
>> On 27/04/2021 at 19:03, Eddie James wrote:
>>> Hi all,
>>>
>>> I'm having a problem in simulation and hardware where my PPC476
>>> processor stops executing instructions after calling /init. In my
>>> case this is a bash script. The code descends to flush the TLB, and
>>> somewhere in the loop in _tlbil_pid, the PC goes to
>>> InstructionTLBError47x but does not go any further. This only occurs
>>> in the crash kernel environment, which is using the same kernel,
>>> initramfs, and init script as the main kernel, which executed fine.
>>> I do not see this problem with linux 4.19 or 3.10. I do see it with
>>> 5.4 and 5.10.
>>>
>>> I see a fair amount of refactoring in the PPC memory management area
>>> between 4.19 and 5.4. Can anyone point me in a direction to debug
>>> this further? My stack trace is below as I can run gdb in
>>> simulation.
>>
>> Can you bisect to pinpoint the culprit commit?
>
> Hi, thanks for your prompt reply. Good idea! I have bisected to:
>
> commit 9e849f231c3c72d4c3c1b07c9cd19ae789da0420 (b8-bad, refs/bisect/bad)
> Author: Christophe Leroy [off-list ref]
> Date:   Thu Feb 21 19:08:40 2019 +0000
>
>     powerpc/mm/32s: use generic mmu_mapin_ram() for all blocks.
>
>     Now that mmu_mapin_ram() is able to handle other blocks than
>     the one starting at 0, the WII can use it for all its blocks.
>
>     Signed-off-by: Christophe Leroy [off-list ref]
>     Signed-off-by: Michael Ellerman [off-list ref]
>
> I also confirmed that reverting this commit resolves the issue in
> 5.4+. Now, I don't understand why this is problematic or what is
> really happening... Reverting is probably not the desired solution.

Can you provide the 'dmesg' output or a dump of the logs printed by the kernel at boot time?

The difference with this commit is that if there are several memblocks, all of them get mapped. Maybe your target doesn't like it.

You are talking about simulation; are you using QEMU? If yes, can you provide details so that I can try to reproduce the issue?

Thanks
Christophe