Re: [RFC] scripts: Include postprocessing script for memory allocation tracing

From: Janani Ravichandran <hidden>
Date: 2016-09-13 18:04:59
Also in: lkml

On Sep 12, 2016, at 8:16 AM, Michal Hocko [off-list ref] wrote:

Hi,

Hello Michal,

I am sorry I didn't follow up on the previous submission.

That’s alright :)

I find this
_really_ helpful. It is great that you could build on top of existing
tracepoints but one thing is not entirely clear to me. Without a begin
marker in __alloc_pages_nodemask we cannot really tell how long the
whole allocation took, which would be extremely useful. Or do you use
any graph tracer tricks to deduce that?

I’m using the function graph tracer to see how long __alloc_pages_nodemask()
took.

There is a note in your
changelog but I cannot seem to find that in the changelog. And FWIW I
would be open to adding a tracepoint like that. It would make our life
so much easier…

The line
echo __alloc_pages_nodemask > set_ftrace_filter in setup_alloc_trace.sh
sets __alloc_pages_nodemask as a function graph filter and this should help
us observe how long the function took.

On Sun 11-09-16 18:24:12, Janani Ravichandran wrote:
[...]

quoted

allocation_postprocess.py is a script which reads from trace_pipe. It
does the following to filter out info from tracepoints that may not
be important:

1. Displays mm_vmscan_direct_reclaim_begin and
mm_vmscan_direct_reclaim_end only when try_to_free_pages has
exceeded the threshold.
2. Displays mm_compaction_begin and mm_compaction_end only when
compact_zone has exceeded the threshold.
3. Displays mm_compaction_try_to_compat_pages only when
try_to_compact_pages has exceeded the threshold.
4. Displays mm_shrink_slab_start and mm_shrink_slab_end only when
the time elapsed between them exceeds the threshold.
5. Displays mm_vmscan_lru_shrink_inactive only when shrink_inactive_list
has exceeded the threshold.

When CTRL+C is pressed, the script shows the times taken by the
shrinkers. However, currently it is not possible to differentiate among
the
superblock shrinkers.

Sample output:
^Ci915_gem_shrinker_scan : total time = 8.731000 ms, max latency =
0.278000 ms
ext4_es_scan : total time = 0.970000 ms, max latency = 0.129000 ms
scan_shadow_nodes : total time = 1.150000 ms, max latency = 0.175000 ms
super_cache_scan : total time = 8.455000 ms, max latency = 0.466000 ms
deferred_split_scan : total time = 25.767000 ms, max latency = 25.485000
ms

Would it be possible to group those per the context?

Absolutely!

I mean a single
allocation/per-process drop down values rather than mixing all those
values together? For example if I see that a process is talling due to
direct reclaim I would love to see what is the worst case allocation
stall and what is the most probable source of that stall. Mixing kswapd
traces would be misleading here.

True. I’ll do that and send a v2. Thanks for the suggestions!

Janani

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help