Thread: 49 messages, 10 authors, 2018-07-19

Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries

From: Michal Hocko <mhocko@kernel.org>
Date: 2018-07-09 08:19:20
Also in: linux-doc, linux-fsdevel, lkml

On Fri 06-07-18 15:32:45, Waiman Long wrote:
[...]
> A rogue application can potentially create a large number of negative
> dentries in the system, consuming most of the available memory, if it
> is not under the direct control of a memory controller that enforces
> a kernel memory limit.
How does this differ from other untracked allocations for untrusted
tasks in general? E.g. nothing really prevents a task from creating a long
chain of unreclaimable dentries and even driving the system to OOM. Negative
dentries, on the other hand, should be easily reclaimable. So why does the
latter need special treatment while the former is OK? There are
quite a few resources that allow an unprivileged user to consume a lot
of memory, and the memory controller is the only reliable way to mitigate
the risk.
> This patchset introduces changes to the dcache subsystem to track and
> optionally limit the number of negative dentries allowed to be created,
> either by background pruning of excess negative dentries or by killing
> them right after use. This capability will help to limit the amount of
> memory that can be consumed by negative dentries.
How are you going to balance that between workloads? What prevents a
rogue application from simply consuming the limit and forcing all others in
the system onto the slow path?
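
To make the concern concrete, here is a toy model, not the patchset's code. It assumes a single system-wide counter with no per-workload accounting and no pruning, and the names `simulate`, `rogue` and `victim` are made up for illustration. Once one creator has used up the global limit, every later creation lands on the slow path regardless of who issues it.

```python
def simulate(limit, events):
    """Count slow-path hits per workload under one shared global limit.

    `events` is a sequence of workload names, one entry per negative
    dentry created. The counter is global, so the model cannot tell
    which workload filled it up (assumption: no per-cgroup accounting).
    """
    total = 0
    slow_hits = {}
    for name in events:
        total += 1
        if total > limit:
            # Over the global limit: this creation takes the slow path.
            slow_hits[name] = slow_hits.get(name, 0) + 1
    return slow_hits


if __name__ == "__main__":
    # The rogue workload creates exactly `limit` negative dentries first,
    # then a well-behaved workload creates a handful.
    events = ["rogue"] * 1000 + ["victim"] * 10
    print(simulate(1000, events))
```

In this model the rogue workload never pays for the limit it exhausted, while the well-behaved one pays on every creation.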
> Patch 1 tracks the number of negative dentries present in the LRU
> lists and reports it in /proc/sys/fs/dentry-state.
If anything, I _think_ vmstat would benefit from this, because the behavior
of memory reclaim does depend on the number of negative dentries.
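
For reference, a minimal sketch of how userspace could read such a counter from /proc/sys/fs/dentry-state. It assumes the six-field layout that later mainline kernels document (nr_dentry, nr_unused, age_limit, want_pages, nr_negative, dummy); the field position used by this patchset may differ, and the helper names are hypothetical.

```python
from typing import NamedTuple


class DentryState(NamedTuple):
    # Field order is an assumption taken from later mainline kernels.
    nr_dentry: int
    nr_unused: int
    age_limit: int
    want_pages: int
    nr_negative: int
    dummy: int


def parse_dentry_state(text: str) -> DentryState:
    """Parse the whitespace-separated integers of dentry-state."""
    fields = [int(tok) for tok in text.split()]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return DentryState(*fields)


def negative_share(state: DentryState) -> float:
    """Fraction of unused (LRU) dentries that are negative."""
    if state.nr_unused == 0:
        return 0.0
    return state.nr_negative / state.nr_unused


if __name__ == "__main__":
    try:
        with open("/proc/sys/fs/dentry-state") as f:
            state = parse_dentry_state(f.read())
    except (OSError, ValueError) as exc:
        print(f"could not read dentry-state: {exc}")
    else:
        print(state, f"negative share of LRU: {negative_share(state):.1%}")
```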
> Patch 2 adds a "neg-dentry-pc" sysctl parameter that can be used to
> specify a soft limit on the number of negative dentries allowed as a
> percentage of total system memory. This parameter is 0 by default, which
> means no negative dentry limiting will be performed.
Percentage has turned out to be the wrong unit for many tunables
over time. Even 1% can be just too much on really large machines.
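
A quick back-of-the-envelope calculation shows how coarse a 1% step gets as memory grows. The 192-byte dentry size is an assumption (a typical sizeof(struct dentry) on x86_64; it varies with kernel configuration), and the helper name is hypothetical.

```python
GIB = 1024 ** 3
DENTRY_BYTES = 192  # assumption: typical struct dentry size on x86_64


def negative_dentry_budget(total_ram_bytes: int, percent: int):
    """Return (bytes, dentry count) allowed by a percent-of-RAM limit."""
    budget_bytes = total_ram_bytes * percent // 100
    return budget_bytes, budget_bytes // DENTRY_BYTES


if __name__ == "__main__":
    for ram_gib in (4, 64, 1024, 12 * 1024):
        nbytes, count = negative_dentry_budget(ram_gib * GIB, 1)
        print(f"{ram_gib:>6} GiB RAM: 1% = {nbytes / GIB:8.2f} GiB, "
              f"about {count:,} negative dentries")
```

On a 1 TiB machine the smallest nonzero setting already allows more than 10 GiB of negative dentries, so the tunable offers no finer control exactly where it is most needed.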
> Patch 3 enables automatic pruning of least recently used negative
> dentries when the total number is close to the preset limit.
Please explain why this cannot be done in the standard dcache shrinking
way. I strongly suspect that you are developing yet another reclaim
mechanism with its own set of tunables, bypassing the existing
infrastructure. I haven't read the patches yet, but the cover letter doesn't
really explain the design much, so I am only guessing.
-- 
Michal Hocko
SUSE Labs