Thread: 16 messages, 5 authors, 2022-08-02

Re: Regression: Linux v5.15+ does not boot on Freescale P2020

From: Pali Rohár <pali@kernel.org>
Date: 2022-07-25 20:10:23
Also in: lkml

On Monday 25 July 2022 16:20:49 Christophe Leroy wrote:
On 25/07/2022 at 14:52, Pali Rohár wrote:
On Monday 25 July 2022 18:20:01 Michael Ellerman wrote:
Pali Rohár [off-list ref] writes:
On Saturday 23 July 2022 14:42:22 Christophe Leroy wrote:
On 22/07/2022 at 11:09, Pali Rohár wrote:
Trying to boot the mainline Linux kernel v5.15+, including the current
version from the master branch, on Freescale P2020 does not work. The
kernel does not print anything to the serial console; it seems not to run
at all, and after the timeout the watchdog resets the board.
Can you provide more information? Which defconfig or .config, which
version of gcc, etc.?
I used the default defconfig for mpc85xx with gcc 8, compiling for e500
cores.

If you need the exact .config content, I can send it during the week.
I ran git bisect and it found the following commit:

9401f4e46cf6965e23738f70e149172344a01eef is the first bad commit
commit 9401f4e46cf6965e23738f70e149172344a01eef
Author: Christophe Leroy [off-list ref]
Date:   Tue Mar 2 08:48:11 2021 +0000

      powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros

      Force the eh flag at 0 on PPC32.

      Signed-off-by: Christophe Leroy [off-list ref]
      Signed-off-by: Michael Ellerman [off-list ref]
      Link: https://lore.kernel.org/r/1fc81f07cabebb875b963e295408cc3dd38c8d85.1614674882.git.christophe.leroy@csgroup.eu

:040000 040000 fe6747e45736dfcba74914a9445e5f70f5120600 96358d08b65d3200928a973efb5b969b3d45f2b0 M      arch


If I revert this commit, the kernel boots correctly. It also boots fine
if I revert this commit on top of the master branch.

Freescale P2020 has two 32-bit e500 powerpc cores.

Any idea why the above commit causes the kernel to crash? And why is it
needed? Could setting the eh flag to 0 cause a deadlock?
Setting the eh flag to 0 is not supposed to be a change introduced by
that commit. Indeed that commit is not supposed to change anything at
all in the generated code.
My understanding of that commit is that it changed the eh flag parameter
from 1 to 0 for 32-bit powerpc, including the P2020.
Can you compare the disassembly before and after and find a place where
an instruction has changed?

cheers
Yes, of course. Here is the diff between the outputs of objdump -d vmlinux.
The original version (---) is built from the git master branch and the
modified version (+++) is the same tree with the problematic commit above
reverted. So the +++ version is the one that works.
--- vmlinux.master.dump	2022-07-25 14:43:45.922239496 +0200
+++ vmlinux.revert.dump	2022-07-25 14:43:49.238259296 +0200
@@ -1,5 +1,5 @@
  
-vmlinux.master:     file format elf32-powerpc
+vmlinux.revert:     file format elf32-powerpc
  
  
  Disassembly of section .head.text:
@@ -11213,7 +11213,7 @@ c000b850:	3f a0 c1 0f 	lis     r29,-1611
  c000b854:	81 02 00 04 	lwz     r8,4(r2)
  c000b858:	3b fd 10 68 	addi    r31,r29,4200
  c000b85c:	39 40 00 01 	li      r10,1
-c000b860:	7d 20 f8 29 	lwarx   r9,0,r31,1
+c000b860:	7d 20 f8 28 	lwarx   r9,0,r31
  c000b864:	2c 09 00 00 	cmpwi   r9,0
  c000b868:	40 82 00 10 	bne     c000b878 <die+0x68>
  c000b86c:	7d 40 f9 2d 	stwcx.  r10,0,r31
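
For readers comparing the two encodings: the words 7d 20 f8 29 and
7d 20 f8 28 differ only in their lowest bit, which in the X-form lwarx
encoding is the EH hint field. The Python sketch below is an illustration
and not part of the original message. The helper names are hypothetical,
and the field layout (primary opcode 31, extended opcode 20, EH in the
least significant bit) is assumed from the Power ISA description of lwarx.

```python
def word_from_objdump_bytes(text: str) -> int:
    """Turn the big-endian byte column of objdump, e.g. '7d 20 f8 29', into an int."""
    return int.from_bytes(bytes.fromhex("".join(text.split())), "big")


def decode_lwarx(word: int) -> dict:
    """Split a 32-bit instruction word into lwarx fields (assumed X-form layout)."""
    primary = (word >> 26) & 0x3F      # primary opcode, expected 31
    extended = (word >> 1) & 0x3FF     # extended opcode, expected 20 for lwarx
    if primary != 31 or extended != 20:
        raise ValueError(f"{word:#010x} is not an lwarx encoding")
    return {
        "rt": (word >> 21) & 0x1F,     # target register
        "ra": (word >> 16) & 0x1F,     # base register (0 means literal zero)
        "rb": (word >> 11) & 0x1F,     # index register
        "eh": word & 0x1,              # exclusive-access hint bit
    }


if __name__ == "__main__":
    for column in ("7d 20 f8 29", "7d 20 f8 28"):
        print(column, decode_lwarx(word_from_objdump_bytes(column)))
```

Under these assumptions both words decode to rt=9, ra=0, rb=31, and only
the eh field differs, which matches the trailing ",1" operand that objdump
prints for the first form.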
That's really strange. I tried mpc85xx_defconfig with GCC 11
and I don't get any such difference.
Yes, that is strange...
Does your version of GCC have anything special?
Nothing. It is an ordinary Debian 10 amd64 system with the cross compiler
from the gcc-powerpc-linux-gnuspe package (standard version, part of Debian).

I have now repeated a clean test with the same Debian 10 cross compiler.

$ git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git && cd linux
$ git checkout v5.15
$ make mpc85xx_smp_defconfig ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnuspe-
$ make vmlinux ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnuspe-
$ cp -a vmlinux vmlinux.v5.15
$ git revert 9401f4e46cf6965e23738f70e149172344a01eef
$ make vmlinux ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnuspe-
$ cp -a vmlinux vmlinux.revert
$ powerpc-linux-gnuspe-objdump -d vmlinux.revert > vmlinux.revert.dump
$ powerpc-linux-gnuspe-objdump -d vmlinux.v5.15 > vmlinux.v5.15.dump
$ diff -Naurp vmlinux.v5.15.dump vmlinux.revert.dump

And the diff contains:

-c000c304:      7d 20 f8 29     lwarx   r9,0,r31,1
+c000c304:      7d 20 f8 28     lwarx   r9,0,r31
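
To check a whole disassembly for this pattern rather than a single diff
hunk, a small filter can help. This is a minimal sketch and not part of
the original message. It assumes the objdump -d line format shown above
(address, four hex bytes, mnemonic) with an optional leading +, - or space
from diff, and the lwarx field layout from the Power ISA (primary opcode
31, extended opcode 20, EH in the lowest bit). The function name is
hypothetical.

```python
import re

# address, colon, then exactly four hex byte pairs; tolerates a diff prefix
LINE_RE = re.compile(r"^[-+ ]?\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s+){4})")


def find_lwarx_with_eh(lines):
    """Return (address, word) for every lwarx whose EH bit is set."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # section headers, labels, diff hunk markers, etc.
        word = int("".join(match.group(2).split()), 16)
        is_lwarx = (word >> 26) == 31 and ((word >> 1) & 0x3FF) == 20
        if is_lwarx and (word & 1):
            hits.append((match.group(1), word))
    return hits


if __name__ == "__main__":
    import sys

    # Example: python3 this_script.py vmlinux.v5.15.dump
    with open(sys.argv[1]) as dump:
        for address, word in find_lwarx_with_eh(dump):
            print(f"{address}: {word:08x}")
```

Running it on both dumps would list every address where the hint bit is
set, which makes it easier to see whether the difference is limited to one
call site.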

I guess this issue must be reproducible, as I'm using the regular
toolchain from the distribution.

Just to note that I had to apply the Makefile patch for CONFIG_E500:
https://lore.kernel.org/linuxppc-dev/20220524093939.30927-1-pali@kernel.org/

But I was told that this issue is also reproducible with the OpenWRT
non-SPE gcc 8 toolchain, without the above Makefile patch.

So I have a feeling that this is related either to gcc 8 or to binutils.
That Debian system has binutils 2.31.1-16. Or maybe something in .config?
Can you send your exact .config?
.config from the above test case is in the attachment.
Thanks
Christophe
