Thread (21 messages) 21 messages, 4 authors, 2016-11-29

[PATCH V5 03/10] efi: parse ARMv8 processor error

From: james.morse@arm.com (James Morse)
Date: 2016-11-25 18:32:57
Also in: kvm, kvmarm, linux-acpi, linux-efi, lkml

Hi Tyler,

On 21/11/16 22:35, Tyler Baicar wrote:
Add support for ARMv8 Common Platform Error Record (CPER).
UEFI 2.6 specification adds support for ARMv8 specific
processor error information to be reported as part of the
CPER records. This provides more detail on for processor error logs.
I think I'm missing a big part of the puzzle here, I will come back to this next
week. I can't quite line up some of the masks and shifts with the table
descriptions in the UEFI spec[0].

quoted hunk ↗ jump to hunk
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 13ea41c..2a9d553 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
quoted hunk ↗ jump to hunk
@@ -180,6 +185,10 @@ enum {
 #define CPER_SEC_PROC_IPF						\
 	UUID_LE(0xE429FAF1, 0x3CB7, 0x11D4, 0x0B, 0xCA, 0x07, 0x00,	\
 		0x80, 0xC7, 0x3C, 0x88, 0x81)
+/* Processor Specific: ARMv8 */
+#define CPER_SEC_PROC_ARMV8						\
+	UUID_LE(0xE19E3D16, 0xBC11, 0x11E4, 0x9C, 0xAA, 0xC2, 0x05,	\
+		0x1D, 0x5D, 0x46, 0xB0)
Nit: UEFI v2.6 N.2.2 (table 249) describes this as 'ARM' not 'ARMV8' (which is
an architectural version).

quoted hunk ↗ jump to hunk
 /* Platform Memory */
 #define CPER_SEC_PLATFORM_MEM						\
 	UUID_LE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83,	\
@@ -255,6 +264,34 @@ enum {
 
 #define CPER_PCIE_SLOT_SHIFT			3
 
+#define CPER_ARMV8_ERR_INFO_NUM_MASK		0x00000000000000FF
+#define CPER_ARMV8_CTX_INFO_NUM_MASK		0x0000000000FFFF00
Table 260 describes both ERR_INFO_NUM and CONTEXT_INFO_NUM for as both being
2bytes long, as does your struct cper_sec_proc_armv8 below. Are these for
something else? Do these correspond with one of the four bitfield formats
described in Table 262->265?

I can't see where they are used, and they look like they are reaching across
multiple fields in a struct.

+#define CPER_ARMV8_CTX_INFO_NUM_SHIFT		8
+
+#define CPER_ARMV8_VALID_MPIDR			0x00000001
+#define CPER_ARMV8_VALID_AFFINITY_LEVEL		0x00000002
+#define CPER_ARMV8_VALID_RUNNING_STATE		0x00000004
+#define CPER_ARMV8_VALID_VENDOR_INFO		0x00000008
+
+#define CPER_ARMV8_INFO_VALID_MULTI_ERR		0x0001
+#define CPER_ARMV8_INFO_VALID_FLAGS		0x0002
+#define CPER_ARMV8_INFO_VALID_ERR_INFO		0x0004
+#define CPER_ARMV8_INFO_VALID_VIRT_ADDR		0x0008
+#define CPER_ARMV8_INFO_VALID_PHYSICAL_ADDR	0x0010
+
+#define CPER_ARMV8_INFO_FLAGS_FIRST		0x0001
+#define CPER_ARMV8_INFO_FLAGS_LAST		0x0002
+#define CPER_ARMV8_INFO_FLAGS_PROPAGATED	0x0004
+
+#define CPER_AARCH64_CTX_LEN			368
+#define CPER_AARCH32_CTX_LEN			256
Are these the worst case sizes for combinations of the structures in N2.4.4.2?
(Tables 266 to 273)

If so is there any chance they could be sizeof(<some union of structs>), even if
the structs are things like:
/* ARMv8 AArch64 GPRs (Type 4) - defined in UEFI Spec N2.4.4.2 */
struct cper_armv8_aarch64_gprs {
	u64 regs[32];
}
This way its easier to check the number is correct, and if a new type is added
this won't get forgotten.

+#define CPER_ARMV8_CTX_TYPE_MASK		0x000000000000000F
+#define CPER_ARMV8_CTX_EL_MASK			0x0000000000000070
+#define CPER_ARMV8_CTX_NS_MASK			0x0000000000000080
+#define CPER_ARMV8_CTX_EL_SHIFT			4
+#define CPER_ARMV8_CTX_NS_SHIFT			7
+
Again, I can't work out what these correspond to. I can't see a secure bit or EL
field in any of those UEFI tables.

Is this one of the 'ARM Vendor Specific Micro-Architecture Error Structure's? If
so we should have some infrastructure for picking the correct (or unknown)
decode function based on a range of MIDRs.

quoted hunk ↗ jump to hunk
 #define acpi_hest_generic_data_error_length(gdata)	\
 	(((struct acpi_hest_generic_data *)(gdata))->error_data_length)
 #define acpi_hest_generic_data_size(gdata)		\
@@ -352,6 +389,41 @@ struct cper_ia_proc_ctx {
 	__u64	mm_reg_addr;
 };
 
+/* ARMv8 Processor Error Section */
+struct cper_sec_proc_armv8 {
+	__u32	validation_bits;
+	__u16	err_info_num; /* Number of Processor Error Info */
+	__u16	context_info_num; /* Number of Processor Context Info Records*/
+	__u32	section_length;
+	__u8	affinity_level;
+	__u8	reserved[3];	/* must be zero */
+	__u64	mpidr;
+	__u64	midr;
+	__u32	running_state; /* Bit 0 set - Processor running. PSCI = 0 */
+	__u32	psci_state;
+};
+
+/* ARMv8 Processor Error Information Structure */
+struct cper_armv8_err_info {
+	__u8	version;
+	__u8	length;
+	__u16	validation_bits;
+	__u8	type;
+	__u16	multiple_error;
+	__u8	flags;
+	__u64	error_info;
+	__u64	virt_fault_addr;
+	__u64	physical_fault_addr;
+};
+/* ARMv8 AARCH64 Processor Context Information Structure */
+struct cper_armv8_aarch64_ctx {
+	__u8	type_el_ns;
+	__u8	reserved[7];	/* must be zero */
+	__u8	gpr[288];
+	__u8	spr[68];
+};
Is this:
"Table 265. ARM Processor Error Context Information Header Structure"?


Sorry if I've missed something blindingly obvious!


Thanks,

James

[0] http://www.uefi.org/sites/default/files/resources/UEFI%20Spec%202_6.pdf
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help