Thread (100 messages) 100 messages, 8 authors, 2017-10-09
STALE3161d

[PATCH v2 11/28] arm64/sve: Core task context handling

From: Dave.Martin@arm.com (Dave Martin)
Date: 2017-09-14 19:40:41
Also in: kvmarm, linux-arch

On Wed, Sep 13, 2017 at 03:21:29PM -0700, Catalin Marinas wrote:
On Wed, Sep 13, 2017 at 08:17:07PM +0100, Dave P Martin wrote:
quoted
On Wed, Sep 13, 2017 at 10:26:05AM -0700, Catalin Marinas wrote:
quoted
On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:
quoted
+/*
+ * Trapped SVE access
+ */
+void do_sve_acc(unsigned int esr, struct pt_regs *regs)
+{
+	/* Even if we chose not to use SVE, the hardware could still trap: */
+	if (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {
+		force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
+		return;
+	}
+
+	task_fpsimd_save();
+
+	sve_alloc(current);
+	fpsimd_to_sve(current);
+	if (test_and_set_thread_flag(TIF_SVE))
+		WARN_ON(1); /* SVE access shouldn't have trapped */
+
+	task_fpsimd_load();
+}
When this function is entered, do we expect TIF_SVE to always be
cleared? It's worth adding a comment on the expected conditions. If
Yes, and this is required for correctness, as you observe.

I had a BUG_ON() here which I removed, but it makes sense to add a
comment to capture the precondition here, and how it is satisfied.
quoted
that's the case, task_fpsimd_save() would only save the FPSIMD state
which is fine. However, you subsequently transfer the FPSIMD state to
SVE, set TIF_SVE and restore the full SVE state. If we don't care about
the SVE state here, can we call task_fpsimd_load() *before* setting
TIF_SVE?
There should be no way to reach this code with TIF_SVE set, unless
task_fpsimd_load() sets the CPACR trap bit wrongly, or the hardware is
broken -- either of which is a bug.
Thanks for confirming my assumptions. What I meant was rewriting the
above function as:

	/* reset the SVE state (other than FPSIMD) */
	task_fpsimd_save();
	task_fpsimd_load();
I think this works, but can you explain your rationale?

I think the main effect of your suggestion is that it is cheaper, due
to eliminating some unnecessary load/store operations.

We could go one better, and do

	mov	v0.16b, v0.16b
	mov	v1.16b, v1.16b
	// ...
	mov	v31.16b, v31.16b

which doesn't require any memory access.

But I still prefer to zero p0..p15, ffr for cleanliness, even though
the SVE programmer's model doesn't require this (unlike for the Z-reg
high bits where we do need to zero them in order not to violate the
programmer's model).

Currently sve_alloc()+task_fpsimd_load() ensures that all the non-FPSIMD
regs are zeroed too, in addition to the Z-reg high bits.

So we might want a special-purpose helper -- if so, we can do it all
with no memory access.

	pfalse	p0.b
	// ..
	pfalse	p15.b
	wrffr	p0.b

This would allow the memset-zero an sve_alloc() to be removed, but I
would need to check what other code is relying on it.

I guess I hadn't done this because I viewed it as an optimisation.

Thoughts?

Cheers
---Dave
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help