Thread (22 messages) 22 messages, 8 authors, 2016-02-25

Re: [PATCH 3/4] sched/deadline: Tracepoints for deadline scheduler

From: Daniel Bristot de Oliveira <hidden>
Date: 2016-02-23 16:19:23
Also in: lkml


On 02/23/2016 07:44 AM, Peter Zijlstra wrote:
quoted
quoted
Worse, the proposed tracepoints are atrocious, look at crap like this:
quoted
quoted
quoted
quoted
+		if (trace_sched_deadline_yield_enabled()) {
+			u64 delta_exec = rq_clock_task(rq) - p->se.exec_start;
+			/* Subtract the last run till now */
+			if (likely((s64)delta_exec > 0))
+				p->dl.runtime -= delta_exec;
+			trace_sched_deadline_yield(&p->dl);
+		}  
tracepoints should _NEVER_ change state, ever.
Heh, it's not really changing state. The code directly after this is:

	p->dl.runtime = 0;
Yes, it more or less 'works', but its still atrocious shite. Its the
worst kind of anti pattern possible.

Suppose someone comes and removes that line, and ignores the tracepoint
stuff, because, hell its a tracepoint, those don't modify stuff.

Its just really, utterly bad practice.
It is possible to clean up this by removing the "p->dl.runtime = 0;"
from yield_task_dl(), to carry the dl_runtime until update_curr_dl().
We end up not proposing this change because we thought it could be
too much intrusive. But seems that it was the correct approach.

Btw, your patch for "Always calculate end of period on sched_yield()"
already do the necessary clean up, so in the possible next version of
these tracepoint the hack will disappear. 
quoted
quoted
quoted
quoted
So tell me why these specific tracepoints and why the existing ones
could not be extended to include this information. For example, why a
trace_sched_dealine_yield, and not a generic trace_sched_yield() that
works for all classes.
But what about reporting actual runtime, and when the next period will
come. That only matters for deadline.
How is that an answer to the question? Are you implying a generic
trace_sched_yield() call could not do this?
I agree that a trace_sched_yield() could partially do this. But each
scheduler would have its own specific data to be printed, and it is hard
to define how many and the type of these data.
quoted
quoted
quoted
quoted
But do not present me with a bunch of random arse, hacked together
tracepoints and tell me they might be useful, maybe.

They ARE useful. These are the tracepoints I'm currently using to
debug the deadline scheduler with. They have been indispensable for my
current work.
They are, most obviously, a hacked together debug session for sure. This
is _NOT_ what you commit.

Now ideally we'd do something like the below, but because trainwreck, we
cannot actually do this I think :-(
I do liked this patch, but I agree that bad things can happen on
applications that already use sched:sched_switch :-(.
 
I really dislike tracepoints, and I'm >.< close to proposing a patch
removing them all from the scheduler.
Scheduler tracepoints become such a standard not only for debugging the
kernel, but also for understanding performance issues and finding
optimizations for user-space tasks, e.g. finding the best way to spread
task among processors on a large NUMA box. Many people would cry
if sched tracepoints disappears :'-(.
quoted hunk ↗ jump to hunk
@@ -132,33 +177,63 @@ TRACE_EVENT(sched_switch,
 	TP_STRUCT__entry(
 		__array(	char,	prev_comm,	TASK_COMM_LEN	)
 		__field(	pid_t,	prev_pid			)
-		__field(	int,	prev_prio			)
+		__field(	int,	prev_policy			)
+		__field(	s64,	prev_f1				)
+		__field(	u64,	prev_f2				)
+		__field(	u64,	prev_f3				)
This helps me to explain why did I chose to create deadline specific
tracepoints.

It is not possible to define in advance a common set of parameters to be
traced/printed for all future (exotic) scheduler, because each scheduler
will use its own set of parameters. Hence, the number and type of the
parameter can vary. How many fields do we need? and which are the types
of these fields? What if a scheduler needs two s64 fields and no u64?

Moreover, it is not possible to define the changes that will be needed
on classical schedulers algorithms for them to fit on Linux's
abstractions and restrictions. Therefore, it is not safe to mention
that LLF would (theoretically) use the same set of parameters of a EDF
scheduler, because the theoretical LLF would need to be modified
to fit on Linux. For example, there is no throttling concept on the
classical deadline scheduler. So, the proposed tracepoints are only valid
for Linux's sched deadline scheduler (deadline scheduler + CBS).

That is why I believe that it would be a better approach to have
scheduler specific tracepoints for the sched_*_entity particularities
and sched_* specific points interest.

Finally, I was based in the same approach used on the fair scheduler's
sched:sched_stat* tracepoints: they report sched_entity data on fair.c,
and my patches report sched_deadline_entity data on deadline.c.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help