Re: Regression: Linux v5.15+ does not boot on Freescale P2020
From: Pali Rohár <pali@kernel.org>
Date: 2022-07-25 20:10:23
Also in: lkml
On Monday 25 July 2022 16:20:49 Christophe Leroy wrote:
> On 25/07/2022 at 14:52, Pali Rohár wrote:
>> On Monday 25 July 2022 18:20:01 Michael Ellerman wrote:
>>> Pali Rohár [off-list ref] writes:
>>>> On Saturday 23 July 2022 14:42:22 Christophe Leroy wrote:
>>>>> On 22/07/2022 at 11:09, Pali Rohár wrote:
>>>>>> Trying to boot the mainline Linux kernel v5.15+, including the current
>>>>>> version from the master branch, on Freescale P2020 does not work. The
>>>>>> kernel does not print anything to the serial console; it seems not to
>>>>>> work, and after a timeout the watchdog resets the board.
>>>>>
>>>>> Can you provide more information? Which defconfig or .config, which
>>>>> version of gcc, etc.?
>>>>
>>>> I used the default defconfig for mpc85xx with gcc 8, compiling for e500
>>>> cores. If you need the exact .config content, I can send it during the
>>>> week.
>>>>>> I ran git bisect and it found the following commit:
>>>>>>
>>>>>> 9401f4e46cf6965e23738f70e149172344a01eef is the first bad commit
>>>>>> commit 9401f4e46cf6965e23738f70e149172344a01eef
>>>>>> Author: Christophe Leroy [off-list ref]
>>>>>> Date: Tue Mar 2 08:48:11 2021 +0000
>>>>>>
>>>>>>     powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros
>>>>>>
>>>>>>     Force the eh flag at 0 on PPC32.
>>>>>>
>>>>>>     Signed-off-by: Christophe Leroy [off-list ref]
>>>>>>     Signed-off-by: Michael Ellerman [off-list ref]
>>>>>>     Link: https://lore.kernel.org/r/1fc81f07cabebb875b963e295408cc3dd38c8d85.1614674882.git.christophe.leroy@csgroup.eu
>>>>>>
>>>>>> :040000 040000 fe6747e45736dfcba74914a9445e5f70f5120600 96358d08b65d3200928a973efb5b969b3d45f2b0 M arch
>>>>>>
>>>>>> If I revert this commit, the kernel boots correctly. It also boots
>>>>>> fine if I revert this commit on top of the master branch.
>>>>>>
>>>>>> Freescale P2020 has two 32-bit e500 powerpc cores.
>>>>>>
>>>>>> Any idea why the above commit is causing the kernel to crash? And why
>>>>>> is it needed? Could the eh flag set to 0 cause a deadlock?
>>>>>
>>>>> Setting the eh flag to 0 is not supposed to be a change introduced by
>>>>> that commit. Indeed that commit is not supposed to change anything at
>>>>> all in the generated code.
>>>>
>>>> My understanding of that commit is that it changed the eh flag parameter
>>>> from 1 to 0 for 32-bit powerpc, including p2020.
>>>
>>> Can you compare the disassembly before and after and find a place where
>>> an instruction has changed?
>>>
>>> cheers
>>
>> Yes, of course. Here is the diff between the outputs of objdump -d vmlinux.
>> The original version (---) is from the git master branch and the modified
>> version (+++) is the same tree with the problematic commit above reverted.
>> So the +++ version is the one that works.
>>
>> --- vmlinux.master.dump 2022-07-25 14:43:45.922239496 +0200
>> +++ vmlinux.revert.dump 2022-07-25 14:43:49.238259296 +0200
>> @@ -1,5 +1,5 @@
>> -vmlinux.master: file format elf32-powerpc
>> +vmlinux.revert: file format elf32-powerpc
>>  Disassembly of section .head.text:
>> @@ -11213,7 +11213,7 @@
>>  c000b850: 3f a0 c1 0f lis r29,-1611
>>  c000b854: 81 02 00 04 lwz r8,4(r2)
>>  c000b858: 3b fd 10 68 addi r31,r29,4200
>>  c000b85c: 39 40 00 01 li r10,1
>> -c000b860: 7d 20 f8 29 lwarx r9,0,r31,1
>> +c000b860: 7d 20 f8 28 lwarx r9,0,r31
>>  c000b864: 2c 09 00 00 cmpwi r9,0
>>  c000b868: 40 82 00 10 bne c000b878 <die+0x68>
>>  c000b86c: 7d 40 f9 2d stwcx. r10,0,r31
>
> That's really strange. I tried mpc85xx_defconfig with GCC 11 and I don't
> get any such difference.
Yes, that is strange...
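As a side note, the two encodings in the diff above (7d 20 f8 29 versus 7d 20 f8 28) differ only in the lowest bit of the instruction word. The following is a minimal sketch that splits both words into fields, assuming the standard X-form layout (primary opcode 31, extended opcode 20 for lwarx, and the EH hint in the last bit). The helper name decode_x_form is made up for illustration and is not part of any toolchain.

```python
def decode_x_form(word):
    """Split a 32-bit PowerPC X-form instruction word into its fields.

    Field positions are an assumption based on the X-form layout:
    opcode(6) | RT(5) | RA(5) | RB(5) | XO(10) | EH(1), most significant first.
    """
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode
        "rt": (word >> 21) & 0x1F,      # target register
        "ra": (word >> 16) & 0x1F,      # base register (0 means no base)
        "rb": (word >> 11) & 0x1F,      # index register
        "xo": (word >> 1) & 0x3FF,      # extended opcode
        "eh": word & 0x1,               # lowest bit, shown by objdump as ",1"
    }


# Byte sequences copied from the objdump diff; objdump prints them in
# big-endian order for elf32-powerpc.
for label, raw in (("master", "7d 20 f8 29"), ("revert", "7d 20 f8 28")):
    word = int.from_bytes(bytes.fromhex(raw), "big")
    print(label, hex(word), decode_x_form(word))
```

Both words decode to the same opcode, registers and extended opcode; only the "eh" field differs, which matches the trailing ",1" operand that objdump shows for the master build.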
> Does your version of GCC have anything special?
Nothing. It is an ordinary Debian 10 amd64 system with the cross compiler from
the gcc-powerpc-linux-gnuspe package (standard version, part of Debian).

I have now repeated a clean test with the same Debian 10 cross compiler:

$ git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git && cd linux
$ git checkout v5.15
$ make mpc85xx_smp_defconfig ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnuspe-
$ make vmlinux ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnuspe-
$ cp -a vmlinux vmlinux.v5.15
$ git revert 9401f4e46cf6965e23738f70e149172344a01eef
$ make vmlinux ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnuspe-
$ cp -a vmlinux vmlinux.revert
$ powerpc-linux-gnuspe-objdump -d vmlinux.revert > vmlinux.revert.dump
$ powerpc-linux-gnuspe-objdump -d vmlinux.v5.15 > vmlinux.v5.15.dump
$ diff -Naurp vmlinux.v5.15.dump vmlinux.revert.dump

And the output contains:

-c000c304: 7d 20 f8 29 lwarx r9,0,r31,1
+c000c304: 7d 20 f8 28 lwarx r9,0,r31

I guess this issue must be reproducible, as I'm using the regular toolchain
from the distribution.

Just to note, I had to apply the Makefile patch for CONFIG_E500:
https://lore.kernel.org/linuxppc-dev/20220524093939.30927-1-pali@kernel.org/

But I was told that this issue is also reproducible with the OpenWRT non-SPE
gcc 8 toolchain, without the above Makefile patch. So I have a feeling that
this is related either to gcc 8 or to binutils. That Debian system has
binutils 2.31.1-16. Or maybe something in .config?
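To check whether other lwarx sites in a dump are affected in the same way, the objdump output can be scanned for lwarx instructions whose lowest encoding bit is set. This is only a rough sketch: the function name lwarx_with_eh and the line pattern are assumptions based on the objdump -d lines quoted in this thread, not on any documented output format guarantee.

```python
import re

# Matches disassembly lines such as:
#   c000c304: 7d 20 f8 29 lwarx r9,0,r31,1
# Groups: address, four space-separated hex bytes, mnemonic.
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} ){4})\s*(\S+)")


def lwarx_with_eh(lines):
    """Return addresses of lwarx instructions whose lowest bit is 1."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group(3) != "lwarx":
            continue
        # Join the four bytes into one big-endian instruction word.
        word = int(match.group(2).replace(" ", ""), 16)
        if word & 1:
            hits.append(match.group(1))
    return hits


# Sample lines taken from the diffs in this thread.
sample = [
    "c000c304: 7d 20 f8 29 lwarx r9,0,r31,1",
    "c000b860: 7d 20 f8 28 lwarx r9,0,r31",
    "c000b864: 2c 09 00 00 cmpwi r9,0",
]
print(lwarx_with_eh(sample))
```

The same function can be fed an open file, for example the vmlinux.v5.15.dump file produced by the commands above, to list every affected address in one pass.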
> Can you send your exact .config?
The .config from the above test case is attached.
> Thanks
> Christophe