Thread: 22 messages, 8 authors, 2018-07-03

[PATCH] arm64/acpi: Add fixup for HPE m400 quirks

From: james.morse@arm.com (James Morse)
Date: 2018-06-28 10:06:00
Also in: linux-acpi

Hi Mark,

On 26/06/18 21:20, Mark Salter wrote:
On Tue, 2018-06-26 at 15:51 +0100, James Morse wrote:
On 25/06/18 16:34, Mark Salter wrote:
On Fri, 2018-06-22 at 11:19 -0400, Mark Salter wrote:
I'm going to hack something to get to the ghes info earlier in boot and
check the things you mention above wrt Error Status Block and GHES.0.
So I had to end up instrumenting the EFI stub to see where the error came
from. At the start of the stub, there is no GHES.2 error. The error first
shows up after the stub's call to ExitBootServices returns.
What's the notification type of GHES.2? I'm guessing POLLed or some kind of IRQ.
These systems don't have EL3, so the CPU must continue running while something
external generates the CPER records. The records being visible is the last point
the faulty-access could have been made, with the window of time depending on how
fast this external-thing receives and processes the error.
There's a System Control Processor (slimpro) on the SoC which can interact with
the CPU in various ways and which has access to memory and other hw.
Thanks, saves me guessing!

So it looks
like the firmware itself is causing the error. There's still a chance that
the stub is doing something wrong with the memory map passed to the
firmware, so I'll try to eliminate that as well.
adding delay loops will help prove the EFIStub is innocent.
Didn't change anything.
Okay, so just to clarify, a delay before ExitBootServices doesn't cause the
error to show up before ExitBootServices, so the error hasn't occurred prior to
this point.
And a delay after ExitBootServices allows us to see the error before we exit
into head.S. (this rules out a bug in head.S)
The delays should be long enough to tell us this slimpro isn't generating the
error records N seconds after reset.

Given this I agree we should disable_hest based on the DMI platform name and the
UEFI version number. (it may be earlier firmware didn't have this bug).
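
To make the matching rule concrete, here is a minimal userspace sketch of the idea, not the actual patch. It models the kernel's two DMI matching styles: a substring match on the product name (the way DMI_MATCH() behaves) and an exact match on the firmware version (the way DMI_EXACT_MATCH() behaves). The struct names, the function names, and both strings in the table are hypothetical placeholders; the real strings would have to come from the affected machines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace model of the DMI fields a quirk would inspect. */
struct dmi_info {
	const char *product_name;
	const char *bios_version;
};

/* One quirk entry: platform name plus the firmware version known to be bad. */
struct hest_quirk {
	const char *product_name;	/* substring match, like DMI_MATCH() */
	const char *bios_version;	/* exact match, like DMI_EXACT_MATCH() */
};

/* HYPOTHETICAL placeholder strings, not taken from real hardware. */
static const struct hest_quirk hest_quirks[] = {
	{ "Example m400 Cartridge", "EX-FW-1.00" },
};

/* True only if both the platform name and the firmware version match. */
static bool quirk_matches(const struct hest_quirk *q, const struct dmi_info *d)
{
	if (!d->product_name || !d->bios_version)
		return false;

	return strstr(d->product_name, q->product_name) != NULL &&
	       strcmp(d->bios_version, q->bios_version) == 0;
}

/* Walk the quirk table; any hit means HEST should be ignored on this system. */
static bool should_disable_hest(const struct dmi_info *d)
{
	size_t i;

	for (i = 0; i < sizeof(hest_quirks) / sizeof(hest_quirks[0]); i++)
		if (quirk_matches(&hest_quirks[i], d))
			return true;

	return false;
}
```

Keying on the firmware version as well as the platform name means a system with different firmware (earlier or later) keeps its HEST, which matters if only this one version has the bug.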


I don't have anything to test this on, so I've picked the DMI strings out of the
dmesg output on that bugzilla entry. Any chance you could give it a test?

Are redhat able to rebuild UEFI on these systems? (Can it be fixed?)
https://bugzilla.redhat.com/show_bug.cgi?id=1285107 is about the m400
description of the GIC, comments 15 and 16 show a UEFI patch to something other
than the upstream platforms tree[0], and new firmware being tested.
(although this may be wishful thinking)
HPE would respond to bug reports until m400 reached EOL. They have been pretty
clear that no more firmware updates will be done.
Thanks, it was a bit murky from that ticket...


Thanks for doing this!

James