Re: [RFC][PATCH 00/21] tracing: Inter-event (e.g. latency) support
From: Tom Zanussi <hidden>
Date: 2017-02-10 18:49:33
Also in:
lkml
Hi Namhyung, On Fri, 2017-02-10 at 13:16 +0900, Namhyung Kim wrote:
Hi Tom, On Wed, Feb 08, 2017 at 11:24:56AM -0600, Tom Zanussi wrote:quoted
This patchset adds support for 'inter-event' quantities to the trace event subsystem. The most important example of inter-event quantities are latencies, or the time differences between two events. One of the main motivations for adding this capability is to provide a general-purpose base that existing existing tools such as the -RT latency_hist patchset can be built upon, while at the same time providing a simple way for users to track latencies (or any inter-event quantity) generically between any two events. Previous -RT latency_hist patchsets that take advantage of the trace event subsystem have been submitted, but they essentially hard-code special-case tracepoints and application logic in ways that can't be reused. It seemed to me that rather than providing a one-off patchset devoted specifically to generating the specific histograms in the latency_hist patchset, it should be possible to build the same functionality on top of a generic layer allowing users to do similar things for other non-latency_hist applications. In addition to preliminary patches that add some basic missing functionality such as a common ringbuffer-derived timestamp and dynamically-creatable tracepoints, the overall patchset is divided up into a few different areas that combine to produce the overall goal (The Documentation patch explains all the details):Looks very nice!
Thanks! ...
quoted
First, we create a synthetic event called wakeup_latency, that references 3 variables from other events: # echo 'wakeup_latency lat=sched_switch:wakeup_lat \ pid=sched_switch:woken_pid \ prio=sched_switch:woken_prio' >> \ /sys/kernel/debug/tracing/synthetic_events Next we add a trigger to sched_wakeup, which saves the value of the 'common_timestamp' when that event is hit in a variable, ts0. Note that this happens only when 'comm==cyclictest'. Also, 'common_timestamp' is a new field defined on every event (if needed - if there are no users of timestamps in a trace, timestamps won't be saved and there's no additional overhead from that). # echo 'hist:keys=pid:ts0=common_timestamp.usecs if \ comm=="cyclictest"' >> \ /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger Next, we add a trigger to sched_switch. When the pid being switched to matches the pid woken up by a previous sched_wakeup event, this event grabs the ts0 saved on that event, takes the difference between it and the current sched_switch's common_timestamp, and assigns it to a new 'wakeup_lat' variable. It also saves a couple other variables and then invokes the onmatch().trace() action which generates a new wakeup_latency event using those variables. # echo 'hist:keys=woken_pid=next_pid:woken_prio=next_prio:\ wakeup_lat=common_timestamp.usecs-ts0:onmatch().trace(wakeup_latency) \ if next_comm=="cyclictest"' >> \ /sys/kernel/debug/tracing/events/sched/sched_switch/triggerAs Masami said, I think the syntax is a bit hard to understand. Also it'd be nice to access an event field directly (i.e. not by adding a field in a hist). Maybe we can use a prefix like '$' to identify hist fields..
Yes, that's a good point, and I like the $ syntax - it makes the variables obvious to users.
How about below? # echo 'wakeup_latency \ lat=sched_switch.$wakeup_lat \ pid=sched_switch.next_pid \ prio=sched_switch.next_prio' >> \ /sys/kernel/debug/tracing/synthetic_events # echo 'hist: \ keys=pid: \ ts0=common_timestamp.usec \ if comm=="cyclictest"' >> \ /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger # echo 'hist: \ keys=next_pid: \ wakeup_lat=common_timestamp.usec-$ts0: \ onmatch(sched_wakeup).trace(wakeup_latency) \ if next_comm=="cyclictest"' >> \ /sys/kernel/debug/tracing/events/sched/sched_switch/trigger By passing an event name to 'onmatch', we can know where to find $ts0 easily IMHO.
I think that also makes a lot of sense - keeping things as explicit as possible makes it clear what's going on. I'll make these changes, modulo the comments along similar lines from Masami. Thanks for the excellent suggestions! Tom
Thanks, Namhyung