Re: [PATCH 1/5] mm/vmscan: Throttle reclaim until some writeback completes if congested
From: Mel Gorman <hidden>
Date: 2021-09-22 08:03:56
Also in: linux-mm, lkml

On Wed, Sep 22, 2021 at 04:04:47PM +1000, Dave Chinner wrote:
> On Tue, Sep 21, 2021 at 11:58:31AM +0100, Mel Gorman wrote:
> > On Tue, Sep 21, 2021 at 10:13:17AM +1000, NeilBrown wrote:
> > > On Mon, 20 Sep 2021, Mel Gorman wrote:
> > > > -long wait_iff_congested(int sync, long timeout)
> > > > -{
> > > > -    long ret;
> > > > -    unsigned long start = jiffies;
> > > > -    DEFINE_WAIT(wait);
> > > > -    wait_queue_head_t *wqh = &congestion_wqh[sync];
> > > > -
> > > > -    /*
> > > > -     * If there is no congestion, yield if necessary instead
> > > > -     * of sleeping on the congestion queue
> > > > -     */
> > > > -    if (atomic_read(&nr_wb_congested[sync]) == 0) {
> > > > -        cond_resched();
> > > > -
> > > > -        /* In case we scheduled, work out time remaining */
> > > > -        ret = timeout - (jiffies - start);
> > > > -        if (ret < 0)
> > > > -            ret = 0;
> > > > -
> > > > -        goto out;
> > > > -    }
> > > > -
> > > > -    /* Sleep until uncongested or a write happens */
> > > > -    prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
> > >
> > > Uninterruptible wait.
> > > ....
> > >
> > > > +static void
> > > > +reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
> > > > +                 long timeout)
> > > > +{
> > > > +    wait_queue_head_t *wqh = &pgdat->reclaim_wait;
> > > > +    unsigned long start = jiffies;
> > > > +    long ret;
> > > > +    DEFINE_WAIT(wait);
> > > > +
> > > > +    atomic_inc(&pgdat->nr_reclaim_throttled);
> > > > +    WRITE_ONCE(pgdat->nr_reclaim_start,
> > > > +               node_page_state(pgdat, NR_THROTTLED_WRITTEN));
> > > > +
> > > > +    prepare_to_wait(wqh, &wait, TASK_INTERRUPTIBLE);
> > >
> > > Interruptible wait.
> > >
> > > Why the change? I think these waits really need to be
> > > TASK_UNINTERRUPTIBLE.
> >
> > Because from mm/ context, I saw no reason why the task *should* be
> > uninterruptible. It's waiting on other tasks to complete IO and it is not
> > protecting device state, filesystem state or anything else. If it gets a
> > signal, it's safe to wake up, particularly if that signal is KILL and the
> > context is a direct reclaimer.
>
> I disagree. Whether the sleep should be interruptible or not is
> entirely dependent on whether the caller can handle failure or not.
> If this is GFP_NOFAIL, allocation must not fail no matter what
> the context is, so signals and the like are irrelevant.
>
> For a context that can handle allocation failure, then it makes
> sense to wake on events that will result in the allocation failing
> immediately. But if all this does is make the allocation code go
> around another retry loop sooner, then an interruptible sleep still
> doesn't make any sense at all here...

Ok, between this and Neil's mail on the same topic, I'm convinced.

-- 
Mel Gorman
SUSE Labs