Re: [PATCH 2/6] MM: annotate congestion_wait() and wait_iff_congested() as ineffective.
From: NeilBrown <hidden>
Date: 2021-09-16 22:13:30
Also in:
linux-ext4, linux-fsdevel, linux-nfs, linux-xfs, lkml
On Wed, 15 Sep 2021, Michal Hocko wrote:
On Tue 14-09-21 10:13:04, Neil Brown wrote:quoted
Only 4 subsystems call set_bdi_congested() or clear_bdi_congested(): block/pktcdvd, fs/ceph fs/fuse fs/nfs It may make sense to use congestion_wait() or wait_iff_congested() within these subsystems, but they have no value outside of these. Add documentation comments to these functions to discourage further use.This is an unfortunate state. The MM layer still relies on the API. While adding a documentation to clarify the current status can stop more usage I am wondering what is a real alternative. My experience tells me that a lack of real alternative will lead to new creative ways of doing things instead.
That is a valid concern. Discouraging the use of an interface without providing a clear alternative risks people doing worse things. At lease if people continue to use congestion_wait(), then we will be able to find those uses when we are able to provide a better approach. I'll drop this patch. Thanks, NeilBrown
quoted
Signed-off-by: NeilBrown <redacted> --- include/linux/backing-dev.h | 7 +++++++ mm/backing-dev.c | 9 +++++++++ 2 files changed, 16 insertions(+)diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index ac7f231b8825..cc9513840351 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h@@ -153,6 +153,13 @@ static inline int wb_congested(struct bdi_writeback *wb, int cong_bits) return wb->congested & cong_bits; } +/* NOTE congestion_wait() and wait_iff_congested() are + * largely useless except as documentation. + * congestion_wait() will (almost) always wait for the given timeout. + * wait_iff_congested() will (almost) never wait, but will call + * cond_resched(). + * Were possible an alternative waiting strategy should be found. + */ long congestion_wait(int sync, long timeout); long wait_iff_congested(int sync, long timeout);diff --git a/mm/backing-dev.c b/mm/backing-dev.c index 4a9d4e27d0d9..53472ab38796 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c@@ -1023,6 +1023,11 @@ EXPORT_SYMBOL(set_bdi_congested); * Waits for up to @timeout jiffies for a backing_dev (any backing_dev) to exit * write congestion. If no backing_devs are congested then just wait for the * next write to be completed. + * + * NOTE: in the current implementation, hardly any backing_devs are ever + * marked as congested, and write-completion is rarely reported (see calls + * to clear_bdi_congested). So this should not be assumed to ever wake before + * the timeout. */ long congestion_wait(int sync, long timeout) {@@ -1054,6 +1059,10 @@ EXPORT_SYMBOL(congestion_wait); * The return value is 0 if the sleep is for the full timeout. Otherwise, * it is the number of jiffies that were still remaining when the function * returned. return_value == timeout implies the function did not sleep. + * + * NOTE: in the current implementation, hardly any backing_devs are ever + * marked as congested, and write-completion is rarely reported (see calls + * to clear_bdi_congested). So this should not be assumed to sleep at all. */ long wait_iff_congested(int sync, long timeout) {-- Michal Hocko SUSE Labs