Thread: 22 messages, 8 authors, 2018-07-03

[PATCH] arm64/acpi: Add fixup for HPE m400 quirks

From: james.morse@arm.com (James Morse)
Date: 2018-06-15 11:14:59
Also in: linux-acpi

Hi Geoff,

On 13/06/18 19:22, Geoff Levand wrote:
Adds a new ACPI init routine acpi_fixup_m400_quirks that adds
a work-around for HPE ProLiant m400 APEI firmware problems.

The work-around disables APEI when CONFIG_ACPI_APEI is set and
m400 firmware is detected.  Without this fixup m400 systems
experience errors like these on startup:

  [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
  [Hardware Error]: event severity: fatal
  [Hardware Error]:  Error 0, type: fatal
  [Hardware Error]:   section_type: memory error
  [Hardware Error]:   error_status: 0x0000000000001300
"Access to a memory address which is not mapped to any component"

  [Hardware Error]:   error_type: 10, invalid address
  Kernel panic - not syncing: Fatal hardware error!

Why is this a problem?

Surely this is a valid description of an error.
(okay, it's not particularly useful without the physical address, but the address
is optional in that structure)

When does this happen during boot? This looks like a driver mapping some
non-existent physical address space to see if its device is present...
unsurprisingly this doesn't go well.
(might also be a typo in the DSDT)

Can't we pin down the driver that does this and fix it? It's either wrong for
everyone, or still broken after you disable APEI.

It seems unlikely there will be any m400 firmware updates to fix
this problem.

What is the problem? This patch looks like it shoots the messenger for bringing
bad news.

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index 7b09487ff8fb..3c315c2c7476 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -31,6 +31,8 @@
 #include <asm/cpu_ops.h>
 #include <asm/smp_plat.h>
 
+#include <acpi/apei.h>
+
 #ifdef CONFIG_ACPI_APEI
 # include <linux/efi.h>
 # include <asm/pgtable.h>
@@ -177,6 +179,33 @@ static int __init acpi_fadt_sanity_check(void)
 	return ret;
 }
 
+/*
+ * acpi_fixup_m400_quirks - Work-around for HPE ProLiant m400 APEI firmware
+ * problems.
+ */
+static void __init acpi_fixup_m400_quirks(void)
+{
+	acpi_status status;
+	struct acpi_table_header *header;
+#if !defined(CONFIG_ACPI_APEI)
+	int hest_disable = HEST_DISABLED;
+#endif
Yuck.

+
+	if (!IS_ENABLED(CONFIG_ACPI_APEI) || hest_disable != HEST_ENABLED)
+		return;
+
+	status = acpi_get_table(ACPI_SIG_HEST, 0, &header);
+
+	if (ACPI_SUCCESS(status) && !strncmp(header->oem_id, "HPE   ", 6) &&
+		!strncmp(header->oem_table_id, "ProLiant", 8) &&
You should match the affected range of OEM table revisions too. That way a
firmware upgrade should start working, instead of being permanently disabled
because we think it's unlikely.

+		MIDR_IMPLEMENTOR(read_cpuid_id()) == ARM_CPU_IMP_APM) {
How is the CPU implementer relevant?

You suggest a firmware-update would make this issue go away...

+		hest_disable = HEST_DISABLED;
+		pr_info("Disabled APEI for m400.\n");
+	}
+
+	acpi_put_table(header);
+}
+
 /*
  * acpi_boot_table_init() called from setup_arch(), always.
  *	1. find RSDP and get its address, and then find XSDT
Nothing arch-specific here. You're adding this to arch/arm64 because
drivers/acpi/apei doesn't have an existing quirks table?
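
To make the two suggestions above concrete (match a range of OEM table
revisions, and keep the match in a generic quirks table rather than in arch
code), here is a rough userspace sketch. It is not existing kernel code:
struct table_id, struct hest_quirk, hest_quirks[] and hest_find_quirk() are
hypothetical names, struct table_id only stands in for the oem_id,
oem_table_id and oem_revision fields of the ACPI table header, and the
revision bounds are placeholders, not real m400 firmware revisions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the identification fields of an ACPI table header. */
struct table_id {
	char oem_id[6];        /* space padded, not NUL terminated */
	char oem_table_id[8];  /* space padded, not NUL terminated */
	uint32_t oem_revision;
};

/* Hypothetical quirk entry: one line per affected firmware range. */
struct hest_quirk {
	char oem_id[7];        /* 6 characters plus NUL */
	char oem_table_id[9];  /* 8 characters plus NUL */
	uint32_t min_rev;      /* first affected revision, inclusive */
	uint32_t max_rev;      /* last affected revision, inclusive */
	const char *reason;
};

/* Placeholder bounds: the real affected revisions are not known here. */
static const struct hest_quirk hest_quirks[] = {
	{ "HPE   ", "ProLiant", 0x1, 0x1, "m400: fatal memory error at boot" },
};

/* Return the matching quirk, or NULL if this firmware is not listed. */
static const struct hest_quirk *hest_find_quirk(const struct table_id *t)
{
	size_t i;

	for (i = 0; i < sizeof(hest_quirks) / sizeof(hest_quirks[0]); i++) {
		const struct hest_quirk *q = &hest_quirks[i];

		if (memcmp(t->oem_id, q->oem_id, 6) ||
		    memcmp(t->oem_table_id, q->oem_table_id, 8))
			continue;
		if (t->oem_revision >= q->min_rev &&
		    t->oem_revision <= q->max_rev)
			return q;
	}
	return NULL;
}
```

With something of this shape, a firmware update that bumps the OEM revision
past max_rev falls out of the table and APEI stays enabled, and no CPU
implementer check is needed.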


Thanks,

James