Thread (49 messages), 7 authors, 2021-01-22

Re: [EXT] Re: [PATCH v4 03/13] task_isolation: userspace hard isolation from kernel

From: Nitesh Narayan Lal <hidden>
Date: 2020-10-05 18:53:10
Also in: linux-api, linux-arch, lkml, netdev

On 10/4/20 7:14 PM, Frederic Weisbecker wrote:
On Sun, Oct 04, 2020 at 02:44:39PM +0000, Alex Belits wrote:
On Thu, 2020-10-01 at 15:56 +0200, Frederic Weisbecker wrote:
On Wed, Jul 22, 2020 at 02:49:49PM +0000, Alex Belits wrote:
+/*
+ * Description of the last two tasks that ran isolated on a given CPU.
+ * This is intended only for messages about isolation breaking. We
+ * don't want any references to actual task while accessing this from
+ * CPU that caused isolation breaking -- we know nothing about timing
+ * and don't want to use locking or RCU.
+ */
+struct isol_task_desc {
+	atomic_t curr_index;
+	atomic_t curr_index_wr;
+	bool	warned[2];
+	pid_t	pid[2];
+	pid_t	tgid[2];
+	char	comm[2][TASK_COMM_LEN];
+};
+static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
So that's quite a huge patch that would have needed to be split up.
Especially this tracing engine.

Speaking of which, I agree with Thomas that it's unnecessary. It's too much
code and complexity. We can use the existing trace events and perform the
analysis from userspace to find the source of the disturbance.
The idea behind this is that isolation breaking events are supposed to
be known to the applications while applications run normally, and they
should not require any analysis or human intervention to be handled.
Sure but you can use trace events for that. Just trace interrupts, workqueues,
timers, syscalls, exceptions and scheduler events and you get all the local
disturbance. You might want to tune a few filters but that's pretty much it.

As for the source of the disturbances, if you really need that information,
you can trace the workqueue and timer queue events and just filter those that
target your isolated CPUs.
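
A minimal userspace sketch of the filtering step described above, not code
from this thread. It assumes tracefs is mounted at /sys/kernel/tracing and
that the queueing event exposes a target-CPU field named "cpu"; check the
event's format file before relying on that name. build_cpu_filter() and
write_control_file() are hypothetical helpers introduced only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build an event filter expression such as "cpu == 2 || cpu == 3" from a
 * list of isolated CPU numbers. The field name is a parameter because it is
 * an assumption: the real name is listed in the event's "format" file.
 * Returns the length of the expression, or -1 if the buffer is too small.
 */
int build_cpu_filter(char *buf, size_t len, const char *field,
                     const int *cpus, size_t ncpus)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    assert(field != NULL && strlen(field) > 0);

    buf[0] = '\0';
    for (size_t i = 0; i < ncpus; i++) {
        /* Join the per-CPU clauses with a logical OR. */
        int n = snprintf(buf + used, len - used, "%s%s == %d",
                         i ? " || " : "", field, cpus[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}

/* Write a string to a tracefs control file. Returns 0 on success. */
int write_control_file(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    int rc;

    if (f == NULL)
        return -1;
    rc = (fputs(value, f) < 0) ? -1 : 0;
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

With isolated CPUs 2 and 3, the resulting expression could be written to
events/workqueue/workqueue_queue_work/filter under the tracefs mount, and "1"
to the enable file in the same directory. Both writes normally need root.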
I agree that we can do all those things with tracing.
However, IMHO a simplified logging mechanism that records the source of a
violation may help reduce the manual effort.

That said, I am not sure how easy it will be to maintain such an interface
over time.

--
Thanks
Nitesh