Re: [RFC] scripts: Include postprocessing script for memory allocation tracing
From: Janani Ravichandran <hidden>
Date: 2016-09-13 18:04:59
Also in:
lkml
On Sep 12, 2016, at 8:16 AM, Michal Hocko [off-list ref] wrote: Hi,
Hello Michal,
I am sorry I didn't follow up on the previous submission.
That’s alright :)
I find this _really_ helpful. It is great that you could build on top of existing tracepoints but one thing is not entirely clear to me. Without a begin marker in __alloc_pages_nodemask we cannot really tell how long the whole allocation took, which would be extremely useful. Or do you use any graph tracer tricks to deduce that?
I’m using the function graph tracer to see how long __alloc_pages_nodemask() took.
There is a note in your changelog but I cannot seem to find that in the changelog. And FWIW I would be open to adding a tracepoint like that. It would make our life so much easier…
The line echo __alloc_pages_nodemask > set_ftrace_filter in setup_alloc_trace.sh sets __alloc_pages_nodemask as a function graph filter and this should help us observe how long the function took.
On Sun 11-09-16 18:24:12, Janani Ravichandran wrote: [...]quoted
allocation_postprocess.py is a script which reads from trace_pipe. It does the following to filter out info from tracepoints that may not be important: 1. Displays mm_vmscan_direct_reclaim_begin and mm_vmscan_direct_reclaim_end only when try_to_free_pages has exceeded the threshold. 2. Displays mm_compaction_begin and mm_compaction_end only when compact_zone has exceeded the threshold. 3. Displays mm_compaction_try_to_compat_pages only when try_to_compact_pages has exceeded the threshold. 4. Displays mm_shrink_slab_start and mm_shrink_slab_end only when the time elapsed between them exceeds the threshold. 5. Displays mm_vmscan_lru_shrink_inactive only when shrink_inactive_list has exceeded the threshold. When CTRL+C is pressed, the script shows the times taken by the shrinkers. However, currently it is not possible to differentiate among the superblock shrinkers. Sample output: ^Ci915_gem_shrinker_scan : total time = 8.731000 ms, max latency = 0.278000 ms ext4_es_scan : total time = 0.970000 ms, max latency = 0.129000 ms scan_shadow_nodes : total time = 1.150000 ms, max latency = 0.175000 ms super_cache_scan : total time = 8.455000 ms, max latency = 0.466000 ms deferred_split_scan : total time = 25.767000 ms, max latency = 25.485000 msWould it be possible to group those per the context?
Absolutely!
I mean a single allocation/per-process drop down values rather than mixing all those values together? For example if I see that a process is talling due to direct reclaim I would love to see what is the worst case allocation stall and what is the most probable source of that stall. Mixing kswapd traces would be misleading here.
True. I’ll do that and send a v2. Thanks for the suggestions! Janani
-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>