Thread (50 messages) 50 messages, 4 authors, 2021-06-15

Re: [PATCH V3 9/9] tracing: Add timerlat tracer

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2021-06-11 20:49:08
Also in: lkml

On Fri, 11 Jun 2021 16:13:36 +0200
Daniel Bristot de Oliveira [off-list ref] wrote:
quoted
quoted
+
+#ifdef CONFIG_STACKTRACE
+/*
+ * Stack trace will take place only at IRQ level, so, no need
+ * to control nesting here.
+ */
+struct trace_stack {
+	int stack_size;
+	int nr_entries;
+	unsigned long           calls[PAGE_SIZE];  
That is rather big. It's 8 * PAGE_SIZE. I don't think that's what you really
wanted.  
no, I did not want that... is 256 a good number?
Sure. But make it a macro.

#define MAX_CALLS 256

or something like that.
quoted
quoted
+};
+
+static DEFINE_PER_CPU(struct trace_stack, trace_stack);
+
+/**  
Again, remove the KernelDoc notation of /**, or make it real kerneldoc
notation.  
Fixed!

[...]
quoted
quoted
  *
@@ -801,6 +1017,22 @@ void trace_softirq_exit_callback(void *data, unsigned int vec_nr)
 	if (!osn_var->sampling)
 		return;
 
+#ifdef CONFIG_TIMERLAT_TRACER
+	/*
+	 * If the timerlat is enabled, but the irq handler did
+	 * not run yet enabling timerlat_tracer, do not trace.
+	 */
+	if (unlikely(osnoise_data.timerlat_tracer)) {
+		struct timerlat_variables *tlat_var;
+		tlat_var = this_cpu_tmr_var();
+		if (!tlat_var->tracing_thread) {  
What happens if the timer interrupt triggers here?  
The tracer will not report the softirq overhead. But at this point, the softirq
is returning, and the duration would be from this time to...


quoted
quoted
+			osn_var->softirq.arrival_time = 0;
+			osn_var->softirq.delta_start = 0;
+			return;
+		}
+	}
+#endif
+
 	duration = get_int_safe_duration(osn_var, &osn_var->softirq.delta_start);  
here.

We can disable interrupts to avoid this issue. But the question is, is it worth
to disable interrupts to avoid this problem?
quoted
quoted
 	trace_softirq_noise(vec_nr, osn_var->softirq.arrival_time, duration);
 	cond_move_thread_delta_start(osn_var, duration);
@@ -893,6 +1125,18 @@ thread_exit(struct osnoise_variables *osn_var, struct task_struct *t)
 	if (!osn_var->sampling)
 		return;
 
+#ifdef CONFIG_TIMERLAT_TRACER
+	if (osnoise_data.timerlat_tracer) {
+		struct timerlat_variables *tlat_var;
+		tlat_var = this_cpu_tmr_var();
+		if (!tlat_var->tracing_thread) {  
Or here?  
The problem that can happen with the softirq cannot happen here: this code runs
with interrupts disabled on __schedule() (it is hooked to the sched_switch).
quoted
quoted
+			osn_var->thread.delta_start = 0;
+			osn_var->thread.arrival_time = 0;
+			return;
+		}
+	}
+#endif
+
 	duration = get_int_safe_duration(osn_var, &osn_var->thread.delta_start);
 
 	trace_thread_noise(t, osn_var->thread.arrival_time, duration);
@@ -1182,6 +1426,197 @@ static int osnoise_main(void *data)
 	return 0;
 }
 
+#ifdef CONFIG_TIMERLAT_TRACER
+/**
+ * timerlat_irq - hrtimer handler for timerlat.
+ */
+static enum hrtimer_restart timerlat_irq(struct hrtimer *timer)
+{
+	struct osnoise_variables *osn_var = this_cpu_osn_var();
+	struct trace_array *tr = osnoise_trace;
+	struct timerlat_variables *tlat;
+	struct timerlat_sample s;
+	u64 now;
+	u64 diff;
+
+	/*
+	 * I am not sure if the timer was armed for this CPU. So, get
+	 * the timerlat struct from the timer itself, not from this
+	 * CPU.
+	 */
+	tlat = container_of(timer, struct timerlat_variables, timer);
+
+	now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
+
+	/*
+	 * Enable the osnoise: events for thread an softirq.
+	 */
+	tlat->tracing_thread = true;
+
+	osn_var->thread.arrival_time = time_get();
+
+	/*
+	 * A hardirq is running: the timer IRQ. It is for sure preempting
+	 * a thread, and potentially preempting a softirq.
+	 *
+	 * At this point, it is not interesting to know the duration of the
+	 * preempted thread (and maybe softirq), but how much time they will
+	 * delay the beginning of the execution of the timer thread.
+	 *
+	 * To get the correct (net) delay added by the softirq, its delta_start
+	 * is set as the IRQ one. In this way, at the return of the IRQ, the delta
+	 * start of the sofitrq will be zeroed, accounting then only the time
+	 * after that.
+	 *
+	 * The thread follows the same principle. However, if a softirq is
+	 * running, the thread needs to receive the softirq delta_start. The
+	 * reason being is that the softirq will be the last to be unfolded,
+	 * resseting the thread delay to zero.
+	 */
+#ifndef CONFIG_PREEMPT_RT
+	if (osn_var->softirq.delta_start) {
+		copy_int_safe_time(osn_var, &osn_var->thread.delta_start,
+				   &osn_var->softirq.delta_start);  
Isn't softirq.delta_start going to be zero here? It doesn't look to get
updated until you set tracing_thread to true, but that happens here, and as
this is in a interrupt context, there will not be a softirq happening
between the setting of that to true to this point.  
No... on the timerlat, the "sampling" is always on. And the
osnoise_data.timerlat_tracer is only checked at the softirq return, so the
softirq entry always set set the delta_start.
OK, I was confused by the timerlat using the "__osnoise_tracer_start()". If
timerlat is going to use that, perhaps we need to rename it, because the
"osnoise" is one tracer, and its confusing that the "timerlat" is using
functions called "*_osnoise_*". I was thinking that those functions were
only for the osnoise tracer and not part of the timerlat tracer, and
ignored them when looking at what the timerlat tracer was doing.

Can we rename that to simply "start_latency_tracing()" or something more
generic.
quoted
quoted
+
+		copy_int_safe_time(osn_var, &osn_var->softirq.delta_start,
+				    &osn_var->irq.delta_start);
+	} else {
+		copy_int_safe_time(osn_var, &osn_var->thread.delta_start,
+				    &osn_var->irq.delta_start);
+	}
+#else /* CONFIG_PREEMPT_RT */
+	/*
+	 * The sofirqs run as threads on RT, so there is not need
+	 * to keep track of it.
+	 */
+	copy_int_safe_time(osn_var, &osn_var->thread.delta_start, &osn_var->irq.delta_start);
+#endif /* CONFIG_PREEMPT_RT */
+
+	/*
+	 * Compute the current time with the expected time.
+	 */
+	diff = now - tlat->abs_period;
+
+	tlat->count++;
+	s.seqnum = tlat->count;
+	s.timer_latency = diff;
+	s.context = IRQ_CONTEXT;
+
+	trace_timerlat_sample(&s);
+
+	/* Keep a running maximum ever recorded os noise "latency" */
+	if (diff > tr->max_latency) {
+		tr->max_latency = diff;
+		latency_fsnotify(tr);
+	}
+
+	if (osnoise_data.stop_tracing_in)
+		if (time_to_us(diff) >= osnoise_data.stop_tracing_in)
+			osnoise_stop_tracing();
+
+	wake_up_process(tlat->kthread);
+
+#ifdef CONFIG_STACKTRACE
+	if (osnoise_data.print_stack)
+		timerlat_save_stack(0);
+#endif  
No need for the #ifdef above. timerlat_save_stack() is defined as a nop
when not enabled, and the compiler will just optimize this out.  
The osnoise_data.print_stack is ifdefed, should I remove it from ifdef?
Well, the above ifdef is for STACKTRACE not for TIMERLAT_TRACER, which
encompasses all of this. And the "timerlat_save_stack()" is a nop when
STACKTRACE is not defined. So no.

-- Steve
quoted
  
quoted
+
+	return HRTIMER_NORESTART;
+}
+
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help