Re: A list of visual Profiler UIs for linux perf
From: Brendan Gregg <hidden>
Date: 2021-09-09 02:16:30
G'Day Stephen,

On Thu, Sep 9, 2021 at 5:13 AM Stephen Brennan [off-list ref] wrote:
> Hi Mark & Brendan,
>
> Thanks for this thread - it's very useful. Firefox Profiler is news to me and looks exciting. However, I can't see any clear documentation on their part that the tool is client-side only. I can't always put internal flamegraphs into a web form on the assumption that they won't be uploaded somewhere. Do you know if there's anything explicit that says data won't be shared? (unless I explicitly upload it to create a link)
>
> Flamescope is quite exciting as well! I can see how the time dimension can be incredibly useful. Brendan, I had a couple questions regarding it:
>
> 1) Is the box color scale in terms of "number of samples in that time interval"? If so, it would only really be useful for cpu-cycles or instructions, correct? Something like the cpu-clock which tries to regularly sample at a set frequency would just look monochrome?

Exclude idle stacks and then cpu-cycles works. Most of our samples are
cpu-cycles based (only thing available in most of EC2). FlameScope
should already filter them:
app/perf/regexp.py:idle_stack = re.compile("(cpuidle|cpu_idle|cpu_bringup_and_idle|native_safe_halt|xen_hypercall_sched_op|xen_hypercall_vcpu_op)")
I've also used it for other non-CPU events including off-CPU spans by
adapting it to sample equivalents.
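
As a rough illustration of the idle-stack filtering described above, the sketch below applies the same regular expression to stack strings and counts the remaining samples per time bucket. Only the idle_stack pattern comes from FlameScope; the (timestamp, stack) tuple format, the bucket_counts helper, and the one-second bucket width are assumptions made for the example, not FlameScope's actual data model.

```python
import re
from collections import Counter

# Pattern quoted above from FlameScope's app/perf/regexp.py.
idle_stack = re.compile(
    "(cpuidle|cpu_idle|cpu_bringup_and_idle|native_safe_halt|"
    "xen_hypercall_sched_op|xen_hypercall_vcpu_op)"
)

def bucket_counts(samples, bucket_seconds=1.0):
    """Count non-idle samples per time bucket.

    samples: iterable of (timestamp_seconds, stack_string) pairs.
    This input shape is hypothetical, chosen to keep the example small.
    """
    counts = Counter()
    for timestamp, stack in samples:
        if idle_stack.search(stack):
            continue  # idle sample: drop it so it does not flatten the counts
        counts[int(timestamp // bucket_seconds)] += 1
    return counts

# Made-up samples, for illustration only.
samples = [
    (0.10, "java;Foo.run;Bar.work"),
    (0.20, "swapper;do_idle;cpuidle_enter"),
    (1.05, "java;Foo.run;Bar.work"),
    (1.50, "swapper;cpu_startup_entry;native_safe_halt"),
    (1.90, "nginx;ngx_process_events"),
]
print(dict(bucket_counts(samples)))
```

With a fixed-frequency sampler, every bucket would hold the same number of samples if idle stacks were kept; dropping them is what lets the per-bucket count vary with load.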

> 2) I'm curious if you've considered directly using perf.data in Flamescope, rather than perf.script? I've recently discovered the "--symfs" and "--kallsyms" options for perf. By using perf buildid-list, you can identify all DSOs, capture their symbol tables, and create a minimal bundle of files to allow the perf.data to be read with useful symbols on any system. Since perf.data contains more information, usually with less disk space, I've started taking this approach to make capturing, transferring, and analyzing larger recordings (especially from customers) easier as well as more flexible and efficient. All the same analysis can be done via the Python engine in perf-script, without need to worry about text parsing.

We do gzip the perf script outputs. Just checking the README, I should probably change 'perf script --header' to use -F to specify the fields, to make it more future proof.

I haven't explored the buildid-list path since we have Java apps with massive symbol tables that can be 100s of Mbytes of text, and other binaries that use a mix of ELF symbol tables and DWARF debuginfo. I've assumed this will be too big to include, but haven't tried yet. Maybe it's better suited for some apps with smaller symbol tables?

Brendan
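
One way to check the size concern before committing to the bundle approach might be to list the DSOs a recording references and add up their on-disk sizes. The sketch below assumes `perf buildid-list -i perf.data` prints one "<build-id> <path>" pair per line; the helper names and the sample output are made up for illustration, and file size is only a crude upper bound on what a symbol bundle would need.

```python
import os
import subprocess

def parse_buildid_list(text):
    """Parse `perf buildid-list` output into (build_id, path) pairs."""
    pairs = []
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
    return pairs

def bundle_size_bytes(pairs):
    """Sum the sizes of DSOs that exist as regular files on this host."""
    total = 0
    for _build_id, path in pairs:
        # Skips pseudo-entries such as [kernel.kallsyms] and missing paths.
        if os.path.isfile(path):
            total += os.path.getsize(path)
    return total

def list_dsos(perf_data="perf.data"):
    """Run perf and return parsed pairs (requires perf on PATH)."""
    out = subprocess.run(
        ["perf", "buildid-list", "-i", perf_data],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_buildid_list(out)

# Hypothetical output, for illustration only.
example = """\
0123456789abcdef0123456789abcdef01234567 [kernel.kallsyms]
89abcdef0123456789abcdef0123456789abcdef /usr/lib/libexample.so
"""
pairs = parse_buildid_list(example)
print(pairs)
print(bundle_size_bytes(pairs))
```

On a real recording, bundle_size_bytes(list_dsos("perf.data")) would give a first estimate of whether the DSOs are small enough to ship alongside the perf.data file.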