Re: [PATCH 00/10] OOM Debug print selection and additional information

From: Qian Cai <hidden>
Date: 2019-08-28 01:32:46
Also in: lkml

On Aug 27, 2019, at 9:13 PM, Edward Chron [off-list ref] wrote:

On Tue, Aug 27, 2019 at 5:50 PM Qian Cai [off-list ref] wrote:

quoted

On Aug 27, 2019, at 8:23 PM, Edward Chron [off-list ref] wrote:



On Tue, Aug 27, 2019 at 5:40 AM Qian Cai [off-list ref] wrote:
On Mon, 2019-08-26 at 12:36 -0700, Edward Chron wrote:

quoted

This patch series provides code that works as a debug option through
debugfs to provide additional controls to limit how much information
gets printed when an OOM event occurs and/or optionally print additional
information about slab usage, vmalloc allocations, user process memory
usage, the number of processes / tasks and some summary information
about these tasks (number runnable, i/o wait), system information
(#CPUs, Kernel Version and other useful state of the system),
ARP and ND Cache entry information.

Linux OOM can optionally provide a lot of information, what's missing?
----------------------------------------------------------------------
Linux provides a variety of detailed information when an OOM event occurs
but has limited options to control how much output is produced. The
system related information is produced unconditionally and limited per
user process information is produced as a default enabled option. The
per user process information may be disabled.

Slab usage information was recently added and is output only if slab
usage exceeds user memory usage.
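
As a rough userspace illustration of that condition, the sketch below compares unreclaimable slab against the pages on the LRU lists using /proc/meminfo. The choice of fields is an assumption about what "slab usage exceeds user memory usage" means in practice, and the helper names are hypothetical; this is not the kernel's own check.

```python
# Fields of /proc/meminfo taken here (as an assumption) to represent
# "user memory": pages on the anonymous, file and unevictable LRU lists.
LRU_FIELDS = (
    "Active(anon)",
    "Inactive(anon)",
    "Active(file)",
    "Inactive(file)",
    "Unevictable",
)


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_exceeds_user_memory(fields):
    """True if unreclaimable slab is larger than the summed LRU fields."""
    lru_kb = sum(fields.get(name, 0) for name in LRU_FIELDS)
    return fields.get("SUnreclaim", 0) > lru_kb
```

On a live system the input would be the contents of /proc/meminfo, for example `slab_exceeds_user_memory(parse_meminfo(open("/proc/meminfo").read()))`.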

Many OOM events are due to user application memory usage, sometimes in
combination with kernel resource usage, that exceeds the expected
memory usage. Detailed information about how memory was being
used when the event occurred may be required to identify the root cause
of the OOM event.

However, some environments are very large and printing all of the
information about processes, slabs and/or vmalloc allocations may
not be feasible. For other environments printing as much information
about these as possible may be needed to root cause OOM events.

For more in-depth analysis of OOM events, people could use kdump to save a
vmcore by setting "panic_on_oom", and then use the crash utility to analyze the
vmcore which contains pretty much all the information you need.
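
A minimal sketch of the "panic_on_oom" part, assuming the sysctl is exposed at /proc/sys/vm/panic_on_oom and that kdump is already configured separately. The function names are hypothetical, and writing the real file requires root.

```python
PANIC_ON_OOM_PATH = "/proc/sys/vm/panic_on_oom"


def set_panic_on_oom(mode, path=PANIC_ON_OOM_PATH):
    """Write the vm.panic_on_oom mode (0, 1 or 2) to the sysctl file.

    0: let the OOM killer pick a task, 1: panic on a system-wide OOM,
    2: panic on any OOM, including constrained ones.
    """
    if mode not in (0, 1, 2):
        raise ValueError("panic_on_oom must be 0, 1 or 2")
    with open(path, "w") as f:
        f.write(f"{mode}\n")


def get_panic_on_oom(path=PANIC_ON_OOM_PATH):
    """Read back the current vm.panic_on_oom mode as an integer."""
    with open(path) as f:
        return int(f.read().strip())
```

With kdump armed, `set_panic_on_oom(1)` would turn the next system-wide OOM into a panic, and the capture kernel would then save the vmcore.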

Certainly, this is the ideal. A full system dump would give you the maximum amount of
information.

Unfortunately some environments may lack space to store the dump,

Kdump usually also supports dumping to a remote target via NFS, SSH, etc.

quoted

let alone the time to dump the storage contents and restart the system. Some

There is also “makedumpfile”, which can compress the dump and filter out unwanted memory
to reduce the vmcore size, and can speed up the dumping process by using multiple threads.
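
For illustration, here is a sketch that only assembles such a makedumpfile command line. It assumes the `-c` (zlib compression), `-d` (dump level) and `--num-threads` options; the helper name, default values and paths are hypothetical.

```python
def makedumpfile_argv(vmcore, output, dump_level=31, threads=4):
    """Build an argv list for a compressed, filtered, multi-threaded dump.

    dump_level is a bitmask from 0 to 31; 31 asks makedumpfile to exclude
    every page type it is able to filter. Nothing is executed here.
    """
    if not 0 <= dump_level <= 31:
        raise ValueError("dump_level must be between 0 and 31")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "makedumpfile",
        "-c",                      # compress each page with zlib
        "-d", str(dump_level),     # filter out unwanted page types
        "--num-threads", str(threads),
        vmcore,
        output,
    ]
```

In the capture kernel the resulting list could be passed to `subprocess.run`, with `/proc/vmcore` as the input and a local or remote-mounted path as the output.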

quoted

systems can take many minutes to fully boot up, to reset and reinitialize all the
devices. So unfortunately this is not always an option, and we need an OOM Report.

I am not sure how the system needing some minutes to reboot is relevant to the
discussion here. The idea is to save a vmcore so it can be analyzed offline, even on
another system, as long as that system has a matching “vmlinux”.

If selecting a dump on an OOM event doesn't reboot the system, and if it runs fast enough
that it doesn't slow processing enough to appreciably affect the system's responsiveness,
then it would be an ideal solution. For some it would be overkill, but since it is an
option, it is a choice to consider or not.

It sounds like you are looking for more of this,

https://github.com/iovisor/bcc/blob/master/tools/oomkill.py
