Thread (14 messages) 14 messages, 2 authors, 2019-02-19

Re: [PATCH 2/6] arm64: HWCAP: add support for ELF_HWCAP2

From: Andrew Murray <hidden>
Date: 2019-02-19 16:53:10

On Mon, Feb 18, 2019 at 04:02:38PM +0000, Dave Martin wrote:
On Mon, Feb 18, 2019 at 03:32:06PM +0000, Andrew Murray wrote:
quoted
On Thu, Feb 07, 2019 at 11:31:24AM +0000, Dave Martin wrote:
quoted
On Wed, Feb 06, 2019 at 01:31:04PM +0000, Andrew Murray wrote:
quoted
As we will exhaust the first 32 bits of ELF_HWCAP let's start
exposing ELF_HWCAP2 to userspace.

Whilst it's possible to use the remaining 32 bits of ELF_HWCAP, we
prefer to expand into ELF_HWCAP2 in order to provide a consistent
view to userspace between ILP32 and LP64.

To reduce complexity and allow for future expansion, we now
represent hwcaps in the kernel as ordinals and use a
KERNEL_HWCAP_ prefix. This allows us to support automatic feature
based module loading for all our hwcaps.

We introduce cpu_set_feature to set hwcaps in the relevant
elf_hwcapX variable which compliments the existing cpu_have_feature
helper. These helpers allow us to clean up existing direct uses of
elf_hwcap.

Signed-off-by: Andrew Murray <redacted>
---
[...]
quoted
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index dfcfba7..15d939e 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -18,11 +18,10 @@
  * In the arm64 world (as in the ARM world), elf_hwcap is used both internally
  * in the kernel and for user space to keep track of which optional features
  * are supported by the current system. So let's map feature 'x' to HWCAP_x.
- * Note that HWCAP_x constants are bit fields so we need to take the log.
  */
 
-#define MAX_CPU_FEATURES	(8 * sizeof(elf_hwcap))
-#define cpu_feature(x)		ilog2(HWCAP_ ## x)
+#define MAX_CPU_FEATURES	(2 * 8 * sizeof(unsigned long))
Doesn't this make 2 * 8 * 8 == 128?

Maybe just say "64".  We open-code 32 later anyway.
I guess 128 would have been correct if we use all the bits of
each ELF_HWCAP{,2} - but we chose to limit each ELF_HWCAP to be
32 bits.

I'll set this to 64 as you suggest.
quoted
quoted
+#define cpu_feature(x)		(KERNEL_HWCAP_ ## x)
 
 #ifndef __ASSEMBLY__
 
@@ -396,9 +395,20 @@ extern struct static_key_false arm64_const_caps_ready;
 
 bool this_cpu_has_cap(unsigned int cap);
 
+static inline void cpu_set_feature(unsigned int num)
+{
+	if (num < 32)
+		elf_hwcap |= BIT(num);
+	else
+		elf_hwcap2 |= BIT(num - 32);
+}
+
 static inline bool cpu_have_feature(unsigned int num)
 {
-	return elf_hwcap & (1UL << num);
+	if (num < 32)
+		return elf_hwcap & BIT(num);
+	else
+		return elf_hwcap2 & BIT(num - 32);
 }
Nit: I worry a bit that people will blithely pass HWCAP_foo to these,
not realising that the interface has changed.
Yeah this would be an easy mistake to make.
quoted
Can we do something like the following:

static inline bool __cpu_have_feature(unsigned int num)
{
	if (WARN_ON(num >= 64))
		return false;

	if (num < 32)
		return elf_hwcap & BIT(num);
	else
		return elf_hwcap2 & BIT(num);
}

#define cpu_have_feature(name) __cpu_have_feature(cpu_feature(name))
It took me a while to figure out what the benefit of the above line
is - this means that callers of cpu_have_feature must use the unprefixed
feature name (e.g. SHA512), anything else such as (KERNEL_HWCAP_SHA512
or HWCAP_SHA512) would result in a compile time error. Whereas without
this users could pass KERNEL_HWCAP_x or HWCAP_x without compile errors
yet resulting in incorrectly passing a bitfield when using HWCAP_x.
That was the idea (apologies for being cryptic...)
:)
quoted
quoted
In compiletime-constant cases, the compiler should simply delete the
WARN.
Is this really needed? Surely this would only come into play when a user
directly calls __cpu_have_feature. I think this is overkill, especially
as __cpu_have_feature is a new function.
quoted
 A couple of cases in the cpufeatures and /proc/cpuinfo code would
need to use __cpu_have_feature directly and would retain the WARN, but
that's not fastpath and probably not the end of the world, plus we
get an additional debugging aid if that code goes wrong.
But even so, the WARN (for the cpuinfo case) is unlikely to ever be hit.
Sure, since we don't expect general purpose callers to use
__cpu_have_feature() directly it is a bit paranoid to include the WARN.
Given the contexts in which this is used, I wouldn't lose sleep if it
weren't there.
quoted
The challenge with this approach is print_cpu_modalias (drivers/base/cpu.c),
this also uses cpu_have_feature - this expects cpu_have_feature to take
an ordinal. How can we work around this?
quoted
Alternatively, we could bake cpu_feature() into HWCAP_CAP(), making
it impossible to specify invalid or user-encoded hwcaps there either.
Then we might justifiably omit the policing in __cpu_have_feature()
altogether.
This would certainly please checkpatch with respect to line lengths :)

However since this patchset, we now use cpu_have_feature elsewhere, such
as in the crypto drivers.
Hmmm, those are irksome issues.

It would be nice to split the ordinal versus symbolic cpu_have_feature()
into separate macros, but that would involve a change to code outside
arch/arm64.  It's almost certainly not worth the pain.


Can you take a look at arch/arm?  That already has hwcap2.
It may makes some sense to follow a common approach, although I'm not
sure how

	#define __hwcap_feature(x)      ilog2(HWCAP_ ## x)
	#define __hwcap2_feature(x)     (32 + ilog2(HWCAP2_ ## x))
	#define cpu_feature(x)          __hwcap2_feature(x)

is supposed to work for cpu_feature(SWP) for example...  Maybe I missed
something.
The limitation with the arm32 approach is that you can only support automatic
module loading with features covered in HWCAP2. We can avoid this.

For arm32 cpu_feature(AES) will only match within HWCAP2, you would get
unintended behaviour if you used a feature name from HWCAP such as NEON.

Thus cpu_have_feature(cpu_feature(X)) is limited to HWCAP2.

Prior to this patchset cpu_have_feature had a sole user and that is
print_cpu_modalias. CPU features are normally tested by directly comparing
against elf_hwcap.
quoted
quoted
quoted
 /* System capability check for constant caps */
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 400b80b..05ee9b9 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -39,12 +39,49 @@
[...]
quoted
quoted
quoted
+#define KERNEL_HWCAP_SSBS              ilog2(HWCAP_SSBS)
+#define KERNEL_HWCAP_SB                ilog2(HWCAP_SB)
+#define KERNEL_HWCAP_PACA              ilog2(HWCAP_PACA)
+#define KERNEL_HWCAP_PACG              ilog2(HWCAP_PACG)
Nit: Odd spacing?  Personally I wouldn't indent with spaces just so that
(ilog2() + 32) lines up nicely, but that's just my opinion and not a
huge deal.

(The spacing can make subsequent diffs look jagged by default, which
can be a distraction for reviewers.)
I'll convert to tabs, I probably copy pasted and the tabs got converted.
quoted
quoted
+
 #ifndef __ASSEMBLY__
 /*
  * This yields a mask that user programs can use to figure out what
  * instruction set this cpu supports.
  */
-#define ELF_HWCAP		(elf_hwcap)
+#define ELF_HWCAP		elf_hwcap
+#define ELF_HWCAP2		elf_hwcap2
Do we need elf_hwcap2 at all?

We could just have these macros fish the bits out of a single variable,
or wherever.

However, it may be better to keep things this way if we want to maintain
the elf_hwcap in the kernel/module ABI.  See note on EXPORT_SYMBOL_GPL()
removal for elf_hwcap in the subsequent patch.
OK, I'll leave it as it is.
quoted
quoted
 
 #ifdef CONFIG_COMPAT
 #define COMPAT_ELF_HWCAP	(compat_elf_hwcap)
@@ -60,6 +97,6 @@ enum {
 #endif
 };
 
-extern unsigned long elf_hwcap;
+extern unsigned long elf_hwcap, elf_hwcap2;
 #endif
 #endif
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 5f0750c..453b45a 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -18,7 +18,7 @@
 #define _UAPI__ASM_HWCAP_H
 
 /*
- * HWCAP flags - for elf_hwcap (in kernel) and AT_HWCAP
+ * HWCAP flags - for AT_HWCAP
  */
 #define HWCAP_FP		(1 << 0)
 #define HWCAP_ASIMD		(1 << 1)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f6d84e2..d10a455 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -38,6 +38,9 @@
 unsigned long elf_hwcap __read_mostly;
 EXPORT_SYMBOL_GPL(elf_hwcap);
 
+unsigned long elf_hwcap2 __read_mostly;
+EXPORT_SYMBOL_GPL(elf_hwcap2);
+
Historically nobody used this because it didn't exist, and we don't want
anybody to use it (since cpu_have_feature() etc. is the preferred
interface).
In that case I should probably EXPORT_SYMBOL_GPL cpu_have_feature.
quoted
So lose EXPORT_SYMBOL_GPL() here.  (You remove it in any case in a
subsequent patch, but having it even transiently sends mixed messages
about the intent.)
No problem.
Hmmm, since this inline in a header we'd have to EXPORT_SYMBOL_GPL the
things it depends on (elf_hwcap, elf_hwcap2) rather than the (inline)
function itself.

So maybe we can't do better than the current situation after all.

If so, I guess exporting elf_hwcap2 as GPL is OK.
Actually my patchset moves the cpu_have_feature function a C file. So I could
add a EXPORT_SYMBOL_GPL to this and the other functions I introduce
(cpu_set_feature, cpu_get_elf_hwcap{,2}).

Unless you have objections I'll respin with this approach.

Andrew Murray
Cheers
---Dave
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help