Re: [EXT] Re: [PATCH v5 6/9] task_isolation: arch/arm64: enable task... | linux-api

Re: [EXT] Re: [PATCH v5 6/9] task_isolation: arch/arm64: enable task isolation functionality

From: Alex Belits <hidden>
Date: 2020-12-04 00:39:21
Also in: linux-arch, linux-arm-kernel, lkml, netdev

On Wed, 2020-12-02 at 13:59 +0000, Mark Rutland wrote:

External Email

-------------------------------------------------------------------
---
Hi Alex,

On Mon, Nov 23, 2020 at 05:58:06PM +0000, Alex Belits wrote:

quoted

In do_notify_resume(), call
task_isolation_before_pending_work_check()
first, to report isolation breaking, then after handling all
pending
work, call task_isolation_start() for TIF_TASK_ISOLATION tasks.

Add _TIF_TASK_ISOLATION to _TIF_WORK_MASK, and _TIF_SYSCALL_WORK,
define local NOTIFY_RESUME_LOOP_FLAGS to check in the loop, since
we
don't clear _TIF_TASK_ISOLATION in the loop.

Early kernel entry code calls task_isolation_kernel_enter(). In
particular:

Vectors:
el1_sync -> el1_sync_handler() -> task_isolation_kernel_enter()
el1_irq -> asm_nmi_enter(), handle_arch_irq()
el1_error -> do_serror()
el0_sync -> el0_sync_handler()
el0_irq -> handle_arch_irq()
el0_error -> do_serror()
el0_sync_compat -> el0_sync_compat_handler()
el0_irq_compat -> handle_arch_irq()
el0_error_compat -> do_serror()

SDEI entry:
__sdei_asm_handler -> __sdei_handler() -> nmi_enter()

As a heads-up, the arm64 entry code is changing, as we found that our
lockdep, RCU, and context-tracking management wasn't quite right. I
have
a series of patches:

https://lore.kernel.org/r/20201130115950.22492-1-mark.rutland@arm.com (local)

... which are queued in the arm64 for-next/fixes branch. I intend to
have some further rework ready for the next cycle.

Thanks!

 I'd appreciate if you
could Cc me on any patches altering the arm64 entry code, as I have a
vested interest.

I will do that.

That was quite obviously broken if PROVE_LOCKING and NO_HZ_FULL were
chosen and context tracking was in use (e.g. with
CONTEXT_TRACKING_FORCE),

I am not yet sure about TRACE_IRQFLAGS, however NO_HZ_FULL and
CONTEXT_TRACKING have to be enabled for it to do anything.

I will check it with PROVE_LOCKING and your patches.

Entry code only adds an inline function that, if task isolation is
enabled, uses raw_local_irq_save() / raw_local_irq_restore(), low-level 
operations and accesses per-CPU variabled by offset, so at very least
it should not add any problems. Even raw_local_irq_save() /
raw_local_irq_restore() probably should be removed, however I wanted to
have something that can be safely called if by whatever reason
interrupts were enabled before kernel was fully entered.

 so I'm assuming that this series has not been
tested in that configuration. What sort of testing has this seen?

On various available arm64 hardware, with enabled

CONFIG_TASK_ISOLATION
CONFIG_NO_HZ_FULL
CONFIG_HIGH_RES_TIMERS

and disabled:

CONFIG_HZ_PERIODIC
CONFIG_NO_HZ_IDLE
CONFIG_NO_HZ

It would be very helpful for the next posting if you could provide
any
instructions on how to test this series (e.g. with pointers to any
test
suite that you have), since it's very easy to introduce subtle
breakage
in this area without realising it.

I will. Currently libtmc ( https://github.com/abelits/libtmc ) contains
all userspace code used for testing, however I should document the
testing procedures.

quoted

Functions called from there:
asm_nmi_enter() -> nmi_enter() -> task_isolation_kernel_enter()
asm_nmi_exit() -> nmi_exit() -> task_isolation_kernel_return()

Handlers:
do_serror() -> nmi_enter() -> task_isolation_kernel_enter()
  or task_isolation_kernel_enter()
el1_sync_handler() -> task_isolation_kernel_enter()
el0_sync_handler() -> task_isolation_kernel_enter()
el0_sync_compat_handler() -> task_isolation_kernel_enter()

handle_arch_irq() is irqchip-specific, most call
handle_domain_irq()
There is a separate patch for irqchips that do not follow this
rule.

handle_domain_irq() -> task_isolation_kernel_enter()
do_handle_IPI() -> task_isolation_kernel_enter() (may be redundant)
nmi_enter() -> task_isolation_kernel_enter()

The IRQ cases look very odd to me. With the rework I've just done for
arm64, we'll do the regular context tracking accounting before we
ever
get into handle_domain_irq() or similar, so I suspect that's not
necessary at all?

The goal is to call task_isolation_kernel_enter() before anything that
depends on a CPU state, including pipeline, that could remain un-
synchronized when the rest of the kernel was sending synchronization
IPIs. Similarly task_isolation_kernel_return() should be called when it
is safe to turn off synchronization. If rework allows it to be done
earlier, there is no need to touch more specific functions.

quoted hunk ↗ jump to hunk

--- a/arch/arm64/include/asm/barrier.h

quoted

+++ b/arch/arm64/include/asm/barrier.h

@@ -49,6 +49,7 @@
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
+#define instr_sync()	isb()

I think I've asked on prior versions of the patchset, but what is
this
for? Where is it going to be used, and what is the expected
semantics?
I'm wary of exposing this outside of arch code because there aren't
strong cross-architectural semantics, and at the least this requires
some documentation.

This is intended as an instruction pipeline flush for the situation
when arch-independent code has to synchronize with potential changes
that it missed. This is necessary after some other CPUs could modify
code (and send IPIs to notify the rest but not isolated CPU) while this
one was still running isolated task or, more likely, exiting from it,
so it might be unlucky enough to pick the old instructions before that
point.

It's only used on kernel entry.

If it's unused, please delete it.

[...]

quoted

diff --git a/arch/arm64/kernel/entry-common.c

b/arch/arm64/kernel/entry-common.c
index 43d4c329775f..8152760de683 100644

--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c

@@ -8,6 +8,7 @@
 #include <linux/context_tracking.h>
 #include <linux/ptrace.h>
 #include <linux/thread_info.h>
+#include <linux/isolation.h>
 
 #include <asm/cpufeature.h>
 #include <asm/daifflags.h>

@@ -77,6 +78,8 @@ asmlinkage void notrace el1_sync_handler(struct

pt_regs *regs)
 {
 	unsigned long esr = read_sysreg(esr_el1);
 
+	task_isolation_kernel_enter();

For regular context tracking we only acount the user<->kernel
transitions.

This is a kernel->kernel transition, so surely this is not necessary?

Right. If we entered kernel from an isolated task, we have already
changed the flags.

If nothing else, it doesn't feel well-balanced.

I havwe not looked at the rest of this patch (or series) in detail.

Thanks,
Mark.

My goal was to make sure that all transitions between kernel and
userspace are covered, so when in doubt I have added corresponding
calls those inline functions, and made them safe to be called from
those places. With improved entry-exit code it should be easier to be
sure where this can be done in a cleaner way.

-- 
Alex

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help