Thread: 16 messages, 5 authors, 2022-08-02

Re: Regression: Linux v5.15+ does not boot on Freescale P2020

From: Segher Boessenkool <hidden>
Date: 2022-07-26 13:45:13
Also in: lkml

On Tue, Jul 26, 2022 at 11:02:59AM +0200, Arnd Bergmann wrote:
> On Tue, Jul 26, 2022 at 10:34 AM Pali Rohár [off-list ref] wrote:
> > On Monday 25 July 2022 16:54:16 Segher Boessenkool wrote:
> > > The EH field in larx insns is new since ISA 2.05, and some ISA 1.x cpu
> > > implementations actually raise an illegal insn exception on EH=1.  It
> > > appears P2020 is one of those.
> >
> > P2020 has e500 cores. e500 cores uses ISA 2.03. So this may be reason.
> > But in official Freescale/NXP documentation for e500 is documented that
> > lwarx supports also eh=1. Maybe it is not really supported.
> > https://www.nxp.com/files-static/32bit/doc/ref_manual/EREF_RM.pdf (page 562)

(page 6-186)

> > At least there is NOTE:
> > Some older processors may treat EH=1 as an illegal instruction.

And the architecture says
  Programming Note
  Warning: On some processors that comply with versions of the
  architecture that precede Version 2.00, executing a Load And Reserve
  instruction in which EH = 1 will cause the illegal instruction error
  handler to be invoked.

> In commit d6ccb1f55ddf ("powerpc/85xx: Make sure lwarx hint isn't set on ppc32")
> this was clarified to affect (all?) e500v1/v2,

  e500v1/v2 based chips will treat any reserved field being set in an
  opcode as illegal.

while the architecture says

  Reserved fields in instructions are ignored by the processor.

Whoops :-)  We need fixes for processor implementation bugs all the
time of course, but this is a massive *design* bug.  I'm surprised this
CPU still works as well as it does!

Even the venerable PEM (last updated in 1997) shows the EH field as
reserved, always treated as 0.
> this one apparently fixed it before,
> but Christophe's commit effectively reverted that change.
>
> I think only the simple_spinlock.h file actually uses EH=1

That's right afaics.

> and this is not
> included in non-SMP kernels, so presumably the only affected machines were
> the rare dual-core e500v2 ones (p2020, MPC8572, bsc9132), which would
> explain why nobody noticed for the past 9 months.

Also people using an SMP kernel on older cores should see the problem,
no?  Or is that patched out?  Or does this use case never happen :-)


Segher
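
Editorial note, not part of the original message: a minimal sketch of where the EH hint sits in the lwarx encoding, for readers who want to see the bit the thread is discussing. EH is the least significant bit of the 32-bit instruction word, which is the bit that older architecture versions defined as reserved. The function names below are hypothetical and exist only for illustration; Python is used as a neutral calculator, not because the kernel does it this way.

```python
# Sketch: encode the PowerPC X-form instruction `lwarx RT,RA,RB,EH`.
# Names are illustrative only; they do not come from the kernel sources.

# Primary opcode 31 and extended opcode 20, with all operand fields zero.
LWARX_BASE = 0x7C000028


def encode_lwarx(rt: int, ra: int, rb: int, eh: int = 0) -> int:
    """Return the 32-bit encoding of `lwarx rt,ra,rb,eh`."""
    for name, reg in (("rt", rt), ("ra", ra), ("rb", rb)):
        if not 0 <= reg <= 31:
            raise ValueError(f"{name} must be a register number in 0..31")
    if eh not in (0, 1):
        raise ValueError("eh must be 0 or 1")
    # RT, RA and RB are 5-bit fields; EH occupies the lowest bit of the word.
    return LWARX_BASE | (rt << 21) | (ra << 16) | (rb << 11) | eh


def eh_bit_set(insn: int) -> bool:
    """True if the lowest bit (EH, or a reserved bit on older cores) is set."""
    return bool(insn & 1)


plain = encode_lwarx(9, 0, 3)          # lwarx r9,0,r3
hinted = encode_lwarx(9, 0, 3, eh=1)   # lwarx r9,0,r3,1
print(f"{plain:#010x} {hinted:#010x}")
```

The two encodings differ only in that lowest bit, so a core that rejects any set reserved bit would, per the commit message quoted above, take an illegal instruction exception on the second form while executing the first one normally.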