Re: A list of visual Profiler UIs for linux perf

From: Brendan Gregg <hidden>
Date: 2021-09-09 02:16:30

G'Day Stephen,

On Thu, Sep 9, 2021 at 5:13 AM Stephen Brennan
[off-list ref] wrote:

Hi Mark & Brendan,

Thanks for this thread - it's very useful.

Firefox Profiler is news to me and looks exciting. However, I can't see
any clear documentation on their part that the tool is client-side only.
I can't always put internal flamegraphs into a web form on the
assumption that they won't be uploaded somewhere. Do you know if there's
anything explicit that says data won't be shared? (unless I explicitly
upload it to create a link)

Flamescope is quite exciting as well! I can see how the time dimension
can be incredibly useful. Brendan, I had a couple questions regarding it:
1) Is the box color scale in terms of "number of samples in that time
interval"? If so, it would only really be useful for cpu-cycles or
instructions, correct? Something like the cpu-clock which tries to
regularly sample at a set frequency would just look monochrome?

Exclude idle stacks, and then cpu-cycles works. Most of our samples are
cpu-cycles based (it's the only thing available in most of EC2).
FlameScope should already filter idle stacks out:

app/perf/regexp.py:idle_stack = re.compile("(cpuidle|cpu_idle|cpu_bringup_and_idle|native_safe_halt|xen_hypercall_sched_op|xen_hypercall_vcpu_op)")
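As a rough illustration of what that pattern does, here is a minimal sketch that drops any stack containing an idle frame. Only the regular expression comes from the line quoted above; the helper names `is_idle` and `drop_idle` and the sample stacks are hypothetical, and FlameScope's own filtering code may be organized differently.

```python
import re

# Pattern as quoted above from FlameScope's app/perf/regexp.py.
idle_stack = re.compile(
    "(cpuidle|cpu_idle|cpu_bringup_and_idle|native_safe_halt|"
    "xen_hypercall_sched_op|xen_hypercall_vcpu_op)"
)


def is_idle(frames):
    """Return True if any frame name in the stack matches the idle pattern.

    Hypothetical helper, not part of FlameScope.
    """
    return any(idle_stack.search(frame) for frame in frames)


def drop_idle(stacks):
    """Keep only the stacks that contain no idle frame (hypothetical helper)."""
    return [frames for frames in stacks if not is_idle(frames)]


# Made-up sample data: one idle stack and one busy stack.
stacks = [
    ["native_safe_halt", "default_idle", "cpu_startup_entry"],
    ["do_syscall_64", "__x64_sys_read", "vfs_read"],
]
busy = drop_idle(stacks)
```

With idle stacks removed, the per-interval sample count reflects how busy the CPUs were, so the heat map is no longer uniform even with a fixed-rate sampler.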

I've also used it for other non-CPU events including off-CPU spans by
adapting it to sample equivalents.
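One possible reading of "sample equivalents", offered only as a sketch: turn each off-CPU span into the timestamps at which a fixed-rate sampler would have observed it. The function name `span_to_samples`, the microsecond units, and the 10 ms default period are assumptions made for illustration, not details from the message.

```python
def span_to_samples(start_us, duration_us, period_us=10_000):
    """Return the sampler tick timestamps (in microseconds) inside a span.

    Hypothetical helper: ticks are assumed to fall on multiples of
    period_us, and the span covers [start_us, start_us + duration_us).
    """
    end_us = start_us + duration_us
    # Round start_us up to the next tick boundary using integer arithmetic.
    first_tick = -(-start_us // period_us) * period_us
    return list(range(first_tick, end_us, period_us))


# A 30 ms span starting at t = 25 ms overlaps the ticks at 30, 40 and 50 ms.
ticks = span_to_samples(25_000, 30_000)
```

Spans shorter than one period may produce no synthetic samples at all, which matches what a real timed sampler would miss.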

2) I'm curious if you've considered directly using perf.data in
Flamescope, rather than perf.script? I've recently discovered the
"--symfs" and "--kallsyms" options for perf. By using perf buildid-list,
you can identify all DSOs, capture their symbol tables, and create a
minimal bundle of files to allow the perf.data to be read with useful
symbols on any system. Since perf.data contains more information,
usually with less disk space, I've started taking this approach to make
capturing, transferring, and analyzing larger recordings (especially
from customers) easier as well as more flexible and efficient. All the
same analysis can be done via the Python engine in perf-script, without
need to worry about text parsing.
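The first step of the workflow described above, identifying the DSOs, could be sketched as follows. It assumes `perf buildid-list -i <file>` prints one "build-id path" pair per line; the helper names and the sample build IDs are made up for illustration, and collecting the symbol tables themselves is left out.

```python
import subprocess


def parse_buildid_list(text):
    """Map each DSO path to its build ID (hypothetical helper).

    Assumes every non-empty line is '<build-id> <path>'.
    """
    dsos = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        build_id, _, path = line.partition(" ")
        dsos[path.strip()] = build_id
    return dsos


def list_dsos(perf_data="perf.data"):
    """Run perf buildid-list on a recording and parse its output."""
    out = subprocess.run(
        ["perf", "buildid-list", "-i", perf_data],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_buildid_list(out)


def on_disk_files(dsos):
    """Entries that are real file paths, skipping pseudo-DSOs such as [vdso]."""
    return sorted(path for path in dsos if path.startswith("/"))


# Made-up output for illustration only.
sample = (
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa [kernel.kallsyms]\n"
    "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb /usr/lib/libc.so.6\n"
)
dsos = parse_buildid_list(sample)
```

The paths returned by `on_disk_files` are the candidates to copy into a directory tree for `--symfs`, while the kernel entry would be covered by a saved kallsyms file passed with `--kallsyms`.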

We do gzip the perf script outputs. Just checking the README, I should
probably change 'perf script --header' to use -F to specify the
fields, to make it more future-proof.
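A minimal sketch of what pinning the fields and gzipping the output might look like. The field list here is an assumption chosen for illustration, not what the FlameScope README specifies, and the helper names are hypothetical.

```python
import gzip
import subprocess

# Assumed field list; adjust to whatever the consumer actually parses.
FIELDS = ["comm", "pid", "tid", "cpu", "time", "event", "ip", "sym", "dso"]


def perf_script_argv(perf_data="perf.data", fields=FIELDS):
    """Build a perf script command line with an explicit -F field list."""
    return ["perf", "script", "--header", "-i", perf_data,
            "-F", ",".join(fields)]


def dump_gzipped(out_path, perf_data="perf.data"):
    """Run perf script and write its output gzip-compressed to out_path."""
    proc = subprocess.run(perf_script_argv(perf_data),
                          check=True, capture_output=True)
    with gzip.open(out_path, "wb") as f:
        f.write(proc.stdout)


argv = perf_script_argv("perf.data")
```

Naming the fields explicitly means the text layout no longer depends on perf's default output, which can differ between versions and event types.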

I haven't explored the buildid-list path since we have Java apps with
massive symbol tables that can be 100s of Mbytes of text, and other
binaries that use a mix of ELF symbol tables and DWARF debuginfo. I've
assumed this will be too big to include, but haven't tried yet. Maybe
it's better suited for some apps with smaller symbol tables?

Brendan
