Thread: 29 messages, 6 authors, 2021-09-22

Re: [PATCH 1/5] mm/vmscan: Throttle reclaim until some writeback completes if congested

From: Mel Gorman <hidden>
Date: 2021-09-22 08:03:56
Also in: linux-mm, lkml

On Wed, Sep 22, 2021 at 04:04:47PM +1000, Dave Chinner wrote:
On Tue, Sep 21, 2021 at 11:58:31AM +0100, Mel Gorman wrote:
On Tue, Sep 21, 2021 at 10:13:17AM +1000, NeilBrown wrote:
On Mon, 20 Sep 2021, Mel Gorman wrote:
-long wait_iff_congested(int sync, long timeout)
-{
-	long ret;
-	unsigned long start = jiffies;
-	DEFINE_WAIT(wait);
-	wait_queue_head_t *wqh = &congestion_wqh[sync];
-
-	/*
-	 * If there is no congestion, yield if necessary instead
-	 * of sleeping on the congestion queue
-	 */
-	if (atomic_read(&nr_wb_congested[sync]) == 0) {
-		cond_resched();
-
-		/* In case we scheduled, work out time remaining */
-		ret = timeout - (jiffies - start);
-		if (ret < 0)
-			ret = 0;
-
-		goto out;
-	}
-
-	/* Sleep until uncongested or a write happens */
-	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
Uninterruptible wait.

....
+static void
+reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
+							long timeout)
+{
+	wait_queue_head_t *wqh = &pgdat->reclaim_wait;
+	unsigned long start = jiffies;
+	long ret;
+	DEFINE_WAIT(wait);
+
+	atomic_inc(&pgdat->nr_reclaim_throttled);
+	WRITE_ONCE(pgdat->nr_reclaim_start,
+		 node_page_state(pgdat, NR_THROTTLED_WRITTEN));
+
+	prepare_to_wait(wqh, &wait, TASK_INTERRUPTIBLE);
Interruptible wait.

Why the change?  I think these waits really need to be TASK_UNINTERRUPTIBLE.
Because from mm/ context, I saw no reason why the task *should* be
uninterruptible. It's waiting on other tasks to complete IO and it is not
protecting device state, filesystem state or anything else. If it gets
a signal, it's safe to wake up, particularly if that signal is KILL and
the context is a direct reclaimer.
I disagree. Whether the sleep should be interruptible or
not is entirely dependent on whether the caller can handle failure
or not. If this is GFP_NOFAIL, allocation must not fail no matter
what the context is, so signals and the like are irrelevant.

For a context that can handle allocation failure, it makes
sense to wake on events that will result in the allocation failing
immediately. But if all this does is make the allocation code go
around another retry loop sooner, then an interruptible sleep still
doesn't make any sense at all here...
Ok, between this and Neil's mail on the same topic, I'm convinced.
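
To make the argument concrete, here is a minimal userspace sketch. It is
not kernel code and not part of the patch: every name in it (alloc_ctx,
model_throttle, model_alloc, the tick counts) is hypothetical, and the
"sleep" is only simulated. It assumes a signal is already pending and that
memory only becomes available after enough ticks have actually been slept.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the discussion above; none of this is kernel API. */
enum sleep_mode { SLEEP_INTERRUPTIBLE, SLEEP_UNINTERRUPTIBLE };

struct alloc_ctx {
	bool can_fail;		/* false models a GFP_NOFAIL-style caller */
	bool signal_pending;	/* a signal is already queued for the task */
	int ticks_until_io;	/* ticks of waiting before writeback frees memory */
};

struct alloc_result {
	bool succeeded;
	int retries;		/* extra passes around the retry loop */
	int ticks_slept;	/* simulated time spent throttled */
};

/*
 * Simulated throttle: returns the number of ticks slept. An interruptible
 * sleep returns immediately when a signal is pending; an uninterruptible
 * sleep always waits out the full timeout.
 */
static int model_throttle(const struct alloc_ctx *ctx, enum sleep_mode mode,
			  int timeout)
{
	if (mode == SLEEP_INTERRUPTIBLE && ctx->signal_pending)
		return 0;
	return timeout;
}

/* Simulated allocation slow path with a bounded retry loop. */
static struct alloc_result model_alloc(struct alloc_ctx ctx,
				       enum sleep_mode mode,
				       int timeout, int max_retries)
{
	struct alloc_result res = { false, 0, 0 };

	assert(timeout > 0 && max_retries > 0);

	while (res.retries < max_retries) {
		/* Enough writeback has completed: the allocation succeeds. */
		if (res.ticks_slept >= ctx.ticks_until_io) {
			res.succeeded = true;
			break;
		}

		res.ticks_slept += model_throttle(&ctx, mode, timeout);

		/* Only a caller that can handle failure may bail out on a signal. */
		if (ctx.can_fail && ctx.signal_pending)
			break;

		/* A caller that cannot fail just goes around again. */
		res.retries++;
	}
	return res;
}
```

In this model, a caller that cannot fail gains nothing from the
interruptible sleep: with a signal pending it uses up every retry without
sleeping at all, while the uninterruptible variant waits and then succeeds.
A caller that can fail is the only case where waking early changes the
outcome, because it fails without waiting out the timeout.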

-- 
Mel Gorman
SUSE Labs