Re: [PATCH 00/11] fs/dcache: Limit # of negative dentries
From: Matthew Wilcox <willy@infradead.org>
Date: 2020-02-26 21:28:54
Also in: linux-fsdevel, lkml
On Wed, Feb 26, 2020 at 02:19:59PM -0500, Waiman Long wrote:
> On 2/26/20 11:29 AM, Matthew Wilcox wrote:
> > On Wed, Feb 26, 2020 at 11:13:53AM -0500, Waiman Long wrote:
> > > A new sysctl parameter "dentry-dir-max" is introduced which accepts a value of 0 (default) for no limit or a positive integer 256 and up. Small dentry-dir-max numbers are forbidden to avoid excessive dentry count checking which can impact system performance.
> >
> > This is always the wrong approach. A sysctl is just a way of blaming the sysadmin for us not being very good at programming.
> >
> > I agree that we need a way to limit the number of negative dentries. But that limit needs to be dynamic and depend on how the system is being used, not on how some overworked sysadmin has configured it.
> >
> > So we need an initial estimate for the number of negative dentries that we need for good performance. Maybe it's 1000. It doesn't really matter; it's going to change dynamically.
> >
> > Then we need a metric to let us know whether it needs to be increased. Perhaps that's "number of new negative dentries created in the last second". And we need to decide how much to increase it; maybe it's by 50% or maybe by 10%. Perhaps somewhere between 10-100% depending on how high the recent rate of negative dentry creation has been.
> >
> > We also need a metric to let us know whether it needs to be decreased. I'm reluctant to say that memory pressure should be that metric because very large systems can let the number of dentries grow in an unbounded way. Perhaps that metric is "number of hits in the negative dentry cache in the last ten seconds". Again, we'll need to decide how much to shrink the target number by.
> >
> > If the number of negative dentries is at or above the target, then creating a new negative dentry means evicting an existing negative dentry. If the number of negative dentries is lower than the target, then we can just create a new one.
> >
> > Of course, memory pressure (and shrinking the target number) should cause negative dentries to be evicted from the old end of the LRU list. But memory pressure shouldn't cause us to change the target number; the target number is what we think we need to keep the system running smoothly.
>
> Thanks for the quick response.
>
> I agree that auto-tuning so that the system administrator doesn't have to worry about it will be the best approach if it is implemented in the right way. However, it is hard to do it right.
>
> How about letting users specify a cap on the amount of total system memory allowed for negative dentries, like one of my previous patches? Internally, there is a predefined minimum and maximum for dentry-dir-max. We sample the total negative dentry counts periodically and adjust the dentry-dir-max accordingly.
>
> Specifying a percentage of total system memory is more intuitive than just specifying a hard number for dentry-dir-max. Still some user input is required.

If you want to base the whole thing on a per-directory target number, or a percentage of the system memory target (rather than my suggestion of a total # of negative dentries), that seems reasonable. What's not reasonable is expecting the sysadmin to be able to either predict the workload, or react to a changing workload in sufficient time. The system has to be self-tuning.

Just look how long stale information stays around about how to tune your Linux system. Here's an article from 2018 suggesting using the 'intr' option for NFS mounts:
https://kb.netapp.com/app/answers/answer_view/a_id/1004893/~/hard-mount-vs-soft-mount-

I made that a no-op in 2007. Any tunable you add to Linux immediately becomes a cargo-cult solution to any problem people are having.
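
For illustration only, here is a rough userspace sketch in Python of the self-tuning target discussed in this thread. It is not kernel code and not a proposed patch. The class name, the method names, the 10% thresholds, the 10% shrink step and the floor on the target are all assumptions made up for the example; the thread only suggests the general shape (grow the target by 10-100% based on the recent creation rate, shrink it based on recent hits, evict from the old end of the LRU, and leave the target alone under memory pressure).

```python
from collections import OrderedDict


class NegativeDentryCache:
    """Toy model of a self-tuning negative dentry limit.

    Every name and constant here is hypothetical; none comes from fs/dcache.c.
    """

    def __init__(self, target=1000, min_target=16):
        self.min_target = min_target            # assumed floor for the target
        self.target = max(target, min_target)   # current limit on cached entries
        self.lru = OrderedDict()                # oldest entry first, newest last
        self.created = 0                        # entries created in this window
        self.hits = 0                           # negative-cache hits in this window

    def lookup(self, name):
        """Return True if `name` is a cached negative entry (a hit)."""
        if name in self.lru:
            self.lru.move_to_end(name)          # mark as most recently used
            self.hits += 1
            return True
        return False

    def add_negative(self, name):
        """Cache a failed lookup, evicting the oldest entries if at the target."""
        if name in self.lru:
            self.lru.move_to_end(name)
            return
        while len(self.lru) >= self.target:
            self.lru.popitem(last=False)        # evict from the old end
        self.lru[name] = None
        self.created += 1

    def shrink_to(self, count):
        """Memory pressure: evict old entries but leave the target unchanged."""
        while len(self.lru) > count:
            self.lru.popitem(last=False)

    def grow_tick(self):
        """Call once per window (e.g. every second) to consider growing."""
        ratio = self.created / self.target
        self.created = 0
        if ratio >= 0.1:                        # assumed threshold
            # Grow by 10% to 100% of the target, scaled by the creation rate.
            self.target += int(self.target * min(ratio, 1.0))

    def shrink_tick(self):
        """Call once per longer window (e.g. every ten seconds) to consider shrinking."""
        hits = self.hits
        self.hits = 0
        if hits * 10 < self.target:             # assumed "too few hits" test
            step = max(1, self.target // 10)    # assumed 10% shrink step
            self.target = max(self.min_target, self.target - step)
            self.shrink_to(self.target)
```

A caller would invoke `add_negative()` on each failed lookup, `lookup()` on each cache probe, and drive `grow_tick()` and `shrink_tick()` from timers. Note that `shrink_to()` models reclaim under memory pressure and deliberately never modifies `target`.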