[PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths
From: Dave.Martin@arm.com (Dave Martin)
Date: 2017-09-28 14:22:12
Also in:
kvmarm, linux-arch
On Thu, Sep 14, 2017 at 10:45:07AM +0100, Alex Benn?e wrote:
Dave Martin [off-list ref] writes:quoted
This patch uses the cpufeatures framework to determine common SVE capabilities and vector lengths, and configures the runtime SVE support code appropriately. ZCR_ELx is not really a feature register, but it is convenient to use it as a template for recording the maximum vector length supported by a CPU, using the LEN field. This field is similar to a feature field in that it is a contiguous bitfield for which we want to determine the minimum system-wide value. This patch adds ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate custom code to populate it. Finding the minimum supported value of the LEN field is left to the cpufeatures framework in the usual way. The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet, so for now we just require it to be zero. Note that much of this code is dormant and SVE still won't be used yet, since system_supports_sve() remains hardwired to false. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Alex Benn?e <redacted> Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com> --- Changes since v1 ---------------- Requested by Alex Benn?e: * Thin out BUG_ON()s: Redundant BUG_ON()s and ones that just check invariants are removed. Important sanity-checks are migrated to WARN_ON()s, with some minimal best-effort patch-up code. Other changes related to Alex Benn?e's comments: * Migrate away from magic numbers for converting VL to VQ. Requested by Suzuki Poulose: * Make sve_vq_map __ro_after_init. Other changes related to Suzuki Poulose's comments: * Rely on cpufeatures for not attempting to update the vq map after boot. --- arch/arm64/include/asm/cpu.h | 4 ++ arch/arm64/include/asm/cpufeature.h | 29 ++++++++++ arch/arm64/include/asm/fpsimd.h | 10 ++++ arch/arm64/kernel/cpufeature.c | 50 +++++++++++++++++ arch/arm64/kernel/cpuinfo.c | 6 ++ arch/arm64/kernel/fpsimd.c | 106 +++++++++++++++++++++++++++++++++++- 6 files changed, 202 insertions(+), 3 deletions(-)diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h index 889226b..8839227 100644 --- a/arch/arm64/include/asm/cpu.h +++ b/arch/arm64/include/asm/cpu.h@@ -41,6 +41,7 @@ struct cpuinfo_arm64 { u64 reg_id_aa64mmfr2; u64 reg_id_aa64pfr0; u64 reg_id_aa64pfr1; + u64 reg_id_aa64zfr0; u32 reg_id_dfr0; u32 reg_id_isar0;@@ -59,6 +60,9 @@ struct cpuinfo_arm64 { u32 reg_mvfr0; u32 reg_mvfr1; u32 reg_mvfr2; + + /* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */ + u64 reg_zcr; }; DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 4ea3441..d98e7ba 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h@@ -10,7 +10,9 @@ #define __ASM_CPUFEATURE_H #include <asm/cpucaps.h> +#include <asm/fpsimd.h> #include <asm/hwcap.h> +#include <asm/sigcontext.h> #include <asm/sysreg.h> /*@@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0) return val == ID_AA64PFR0_EL0_32BIT_64BIT; } +static inline bool id_aa64pfr0_sve(u64 pfr0) +{ + u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT); + + return val > 0; +} + void __init setup_cpu_features(void); void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,@@ -267,6 +276,26 @@ static inline bool system_supports_sve(void) return false; } +/* + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE + * vector length. + * Use only if SVE is present. This function clobbers the SVE vector length. + */:nit whitespace formatting.
I'll add some newlines now to make this cleaner. /* * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE * vector length. * * Use only if SVE is present. * This function clobbers the SVE vector length. */ OK?
quoted
+static u64 __maybe_unused read_zcr_features(void) +{ + u64 zcr; + unsigned int vq_max; + + write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);I'm confused, why are we writing something here? You mention clobbering the SVE vector length but what was the point?
Hmm, this deserves a comment -- coming back to this code, I had to think about it. Are the following extra comments sufficient explanation? /* * Set the maximum possible VL, and write zeroes to all other * bits to see if they stick. */ write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1); zcr = read_sysreg_s(SYS_ZCR_EL1); zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* flag up sticky 1s outside LEN field */ vq_max = sve_vq_from_vl(sve_get_vl()); zcr |= vq_max - 1; /* set LEN field to maximum effective value */ [...]
Otherwise: Reviewed-by: Alex Benn?e <redacted>
I'll wait on your responses to the above first. Cheers ---Dave