Thread: 80 messages, 12 authors, 2021-10-15

Re: [PATCH v10 3/3] mm: add anonymous vma name refcounting

From: Suren Baghdasaryan <surenb@google.com>
Date: 2021-10-12 05:36:39
Also in: linux-fsdevel, linux-mm, lkml

On Mon, Oct 11, 2021 at 8:00 PM Johannes Weiner [off-list ref] wrote:
On Mon, Oct 11, 2021 at 06:20:25PM -0700, Suren Baghdasaryan wrote:
quoted
On Mon, Oct 11, 2021 at 6:18 PM Suren Baghdasaryan [off-list ref] wrote:
quoted
On Mon, Oct 11, 2021 at 1:36 AM Michal Hocko [off-list ref] wrote:
quoted
On Fri 08-10-21 13:58:01, Kees Cook wrote:
quoted
- Strings for "anon" specifically have no required format (this is good)
  it's informational like the task_struct::comm and can be (roughly)
  anything. There's no naming convention for memfds, AF_UNIX, etc. Why
  is one needed here? That seems like a completely unreasonable
  requirement.
I might be misreading the justification for the feature. Patch 2 is
talking about tools that need to understand memory usage to take
further actions. Also Suren was suggesting "numbering convention" as an
argument against.

So can we get a clear example of how this is actually being used? If this
is just to be used for debugging by humans then I can see an argument for a
human-readable form. If this is, however, meant to be used by tools to
take some actions then the argument for strings is much weaker.
The simplest use case is when we notice that a process consumes more
memory than usual and we do "cat /proc/$(pidof my_process)/maps" to
check which area is contributing to this growth. The names we assign
to anonymous areas are descriptive enough for a developer to get an
idea where the increased consumption is coming from and how to proceed
with their investigation.
There are of course cases when tools are involved, but the end-user is
always a human and the final report should contain easily
understandable data.
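
As a rough illustration of the tool-assisted variant of that workflow, here is a minimal Python sketch that groups the address-space size of named anonymous areas from a maps dump. It assumes the [anon:<name>] form mentioned in this thread; the sample names and addresses are made up, and the totals are virtual sizes taken from the address ranges, not resident memory.

```python
import re
from collections import defaultdict

# Matches one /proc/<pid>/maps line; the trailing path/name field is optional.
MAPS_LINE = re.compile(
    r"^(?P<start>[0-9a-f]+)-(?P<end>[0-9a-f]+)\s+\S+\s+\S+\s+\S+\s+\S+\s*(?P<name>.*)$"
)


def anon_usage(maps_text):
    """Return {name: total bytes of address space} for [anon:<name>] mappings."""
    totals = defaultdict(int)
    for line in maps_text.splitlines():
        m = MAPS_LINE.match(line)
        if not m:
            continue
        name = m.group("name").strip()
        if name.startswith("[anon:") and name.endswith("]"):
            size = int(m.group("end"), 16) - int(m.group("start"), 16)
            totals[name[len("[anon:"):-1]] += size
    return dict(totals)


# Hypothetical dump; the names below are illustrative only.
SAMPLE = """\
7f0000000000-7f0000004000 rw-p 00000000 00:00 0                          [anon:libc_malloc]
7f0000010000-7f0000011000 rw-p 00000000 00:00 0                          [anon:scudo:primary]
7f0000020000-7f0000022000 rw-p 00000000 00:00 0                          [anon:libc_malloc]
7f0000030000-7f0000031000 r-xp 00000000 08:01 1234                       /system/lib64/libc.so
"""

if __name__ == "__main__":
    for name, size in sorted(anon_usage(SAMPLE).items()):
        print(f"{size:>10} {name}")
```

Because the names are already strings in the dump, the report needs no lookup step before a human can read it.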

IIUC, the main argument here is whether the userspace can provide
tools to perform the translations between ids and names, with the
kernel accepting and reporting ids instead of strings. Technically
it's possible, but to be practical that conversion should be fast
because we will need to make name->id conversion potentially for each
mmap. On the consumer side the performance is not as critical, but the
fact that instead of dumping /proc/$pid/maps we will have to parse the
file, do id->name conversion and replace all [anon:id] with
[anon:name] would be an issue when we do that in bulk, for example
when collecting system-wide data for a bugreport.
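
To make the cost of the id-based alternative concrete, here is a hedged Python sketch of both halves: a userspace registry that interns names to ids on the producer side, and the post-processing pass that rewrites [anon:id] back to [anon:name] on the consumer side. The [anon:id] output format, the NameRegistry class and the resolve_anon_ids helper are all hypothetical; nothing like them exists in the patch set.

```python
import re

# Hypothetical format: the kernel would report [anon:<decimal id>].
ANON_ID = re.compile(r"\[anon:(\d+)\]")


class NameRegistry:
    """Producer side: intern names to small integer ids (one dict lookup per call)."""

    def __init__(self):
        self._ids = {}

    def id_for(self, name):
        # A new name gets the next id; a known name returns its existing id.
        return self._ids.setdefault(name, len(self._ids) + 1)

    def table(self):
        """Return the id -> name table that would ship with a bugreport."""
        return {ident: name for name, ident in self._ids.items()}


def resolve_anon_ids(maps_text, id_to_name):
    """Rewrite each [anon:<id>] as [anon:<name>]; unknown ids are left as-is."""

    def substitute(match):
        name = id_to_name.get(int(match.group(1)))
        return match.group(0) if name is None else f"[anon:{name}]"

    return ANON_ID.sub(substitute, maps_text)
```

Every maps dump then has to travel with the matching table, and a dump collected without it stays unreadable, which is the bulk-collection concern described above.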
Is that something you need to do client-side? Or could the bug tool
upload the userspace-maintained name:ids database alongside the
/proc/pid/maps dump for external processing?
You can generate a bugreport and analyze it locally or submit it as an
attachment to a bug for further analysis.
Sure, we can attach the id->name conversion table to the bugreport but
either way, some tool would have to post-process it to resolve the
ids. If we are not analyzing the results immediately then that step
can be postponed and I think that's what you mean? If so, then yes,
that is correct.