Thread (21 messages) 21 messages, 8 authors, 2015-12-10

Re: [PATCH 2/2] mm/page_ref: add tracepoint to track down page reference manipulation

From: Michal Nazarewicz <hidden>
Date: 2015-11-10 16:02:50
Also in: linux-mm, lkml

On Mon, Nov 09 2015, Joonsoo Kim wrote:
CMA allocation should be guaranteed to succeed by definition, 
Uh?  That’s a peculiar statement.  Which is to say that it’s not true.
but,
unfortunately, it would be failed sometimes. It is hard to track down
the problem, because it is related to page reference manipulation and
we don't have any facility to analyze it.

This patch adds tracepoints to track down page reference manipulation.
With it, we can find exact reason of failure and can fix the problem.
Following is an example of tracepoint output.

<...>-9018  [004]    92.678375: page_ref_set:         pfn=0x17ac9 flags=0x0 count=1 mapcount=0 mapping=(nil) mt=4 val=1
<...>-9018  [004]    92.678378: kernel_stack:
 => get_page_from_freelist (ffffffff81176659)
 => __alloc_pages_nodemask (ffffffff81176d22)
 => alloc_pages_vma (ffffffff811bf675)
 => handle_mm_fault (ffffffff8119e693)
 => __do_page_fault (ffffffff810631ea)
 => trace_do_page_fault (ffffffff81063543)
 => do_async_page_fault (ffffffff8105c40a)
 => async_page_fault (ffffffff817581d8)
[snip]
<...>-9018  [004]    92.678379: page_ref_mod:         pfn=0x17ac9 flags=0x40048 count=2 mapcount=1 mapping=0xffff880015a78dc1 mt=4 val=1
[snip]
...
...
<...>-9131  [001]    93.174468: test_pages_isolated:  start_pfn=0x17800 end_pfn=0x17c00 fin_pfn=0x17ac9 ret=fail
[snip]
<...>-9018  [004]    93.174843: page_ref_mod_and_test: pfn=0x17ac9 flags=0x40068 count=0 mapcount=0 mapping=0xffff880015a78dc1 mt=4 val=-1 ret=1
 => release_pages (ffffffff8117c9e4)
 => free_pages_and_swap_cache (ffffffff811b0697)
 => tlb_flush_mmu_free (ffffffff81199616)
 => tlb_finish_mmu (ffffffff8119a62c)
 => exit_mmap (ffffffff811a53f7)
 => mmput (ffffffff81073f47)
 => do_exit (ffffffff810794e9)
 => do_group_exit (ffffffff81079def)
 => SyS_exit_group (ffffffff81079e74)
 => entry_SYSCALL_64_fastpath (ffffffff817560b6)

This output shows that problem comes from exit path. In exit path,
to improve performance, pages are not freed immediately. They are gathered
and processed by batch. During this process, migration cannot be possible
and CMA allocation is failed. This problem is hard to find without this
page reference tracepoint facility.

Enabling this feature bloat kernel text 20 KB in my configuration.

   text    data     bss     dec     hex filename
12041272        2223424 1507328 15772024         f0a978 vmlinux_disabled
12064844        2225920 1507328 15798092         f10f4c vmlinux_enabled

Signed-off-by: Joonsoo Kim <redacted>
Acked-by: Michal Nazarewicz <redacted>
---
 include/trace/events/page_ref.h | 128 ++++++++++++++++++++++++++++++++++++++++
I haven’t really looked at the above file though.

-- 
Best regards,                                            _     _
.o. | Liege of Serenely Enlightened Majesty of         o' \,=./ `o
..o | Computer Science,  ミハウ “mina86” ナザレヴイツ  (o o)
ooo +--[off-list ref]--<xmpp:mina86-/eSpBmjxGS4dnm+yROfE0A@public.gmane.org>-----ooO--(_)--Ooo--
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help