Re: Re: [PATCH v1] sched/numa: add per-process numa_balancing
From: Mel Gorman <mgorman@suse.de>
Date: 2021-10-29 08:38:00
Also in:
linux-api, linux-fsdevel, lkml
On Fri, Oct 29, 2021 at 02:12:28PM +0800, Gang Li wrote:
> On 10/28/21 11:30 PM, Mel Gorman wrote:
> > That aside though, the configuration space could be better. It's possible to selectively disable NUMA balance but not selectively enable because prctl is disabled if global NUMA balancing is disabled. That could be somewhat achieved by having a default value for mm->numa_balancing based on whether the global numa balancing is disabled via command line or sysctl and enabling the static branch if prctl is used with an informational message. This is not the only potential solution but as it stands, there are odd semantic corner cases. For example, explicit enabling of NUMA balancing by prctl gets silently revoked if numa balancing is disabled via sysctl and prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING, 1) means nothing.
>
> static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
> {
>         ...
>         if (static_branch_unlikely(&sched_numa_balancing))
>                 task_tick_numa(rq, curr);
>         ...
> }
>
> static void task_tick_numa(struct rq *rq, struct task_struct *curr)
> {
>         ...
>         if (!READ_ONCE(curr->mm->numa_balancing))
>                 return;
>         ...
> }
>
> When global numa_balancing is disabled, mm->numa_balancing is useless.

I'm aware that this is the behaviour of the patch as-is.

> So I think prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING, 0/1) should return error instead of modify mm->numa_balancing.
>
> Is it reasonable that prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING, 0/1) can still change the value of mm->numa_balancing when global numa_balancing is disabled?

My point is that as it stands, prctl(PR_NUMA_BALANCING, PR_SET_NUMA_BALANCING, 1) either does nothing or fails. If per-process numa balancing is to be introduced, it should have meaning with the global tuning affecting default behaviour and the prctl affecting specific behaviour.

-- 
Mel Gorman
SUSE Labs
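
For illustration only, the difference between the two semantics discussed in this thread can be modelled in a few lines of standalone C. This is a userspace sketch, not kernel code and not part of the patch. The tri-state enum `numa_mode` and the helpers `scan_as_posted()` and `scan_as_suggested()` are hypothetical names invented here, and the tri-state is only one possible reading of "global tuning affecting default behaviour and the prctl affecting specific behaviour".

```c
#include <stdbool.h>

/*
 * Hypothetical per-process setting (an assumption for this sketch, not
 * taken from the patch): a tri-state lets "never set by prctl" be
 * distinguished from an explicit enable or disable.
 */
enum numa_mode {
	NUMA_MODE_DEFAULT,	/* follow the global tuning */
	NUMA_MODE_DISABLED,	/* explicitly disabled by the process */
	NUMA_MODE_ENABLED,	/* explicitly enabled by the process */
};

/*
 * Behaviour as described in the thread for the posted patch: the global
 * switch is checked first, so the per-process flag can only turn
 * balancing off, never on.
 */
static bool scan_as_posted(bool global_enabled, bool mm_numa_balancing)
{
	if (!global_enabled)
		return false;	/* per-process flag is never consulted */
	return mm_numa_balancing;
}

/*
 * One possible reading of the suggested semantics: an explicit
 * per-process choice wins, and the global setting only supplies the
 * default.
 */
static bool scan_as_suggested(bool global_enabled, enum numa_mode mode)
{
	switch (mode) {
	case NUMA_MODE_ENABLED:
		return true;
	case NUMA_MODE_DISABLED:
		return false;
	default:
		return global_enabled;
	}
}
```

With the global setting off, `scan_as_posted(false, true)` is false, which is the "means nothing" case above, while `scan_as_suggested(false, NUMA_MODE_ENABLED)` is true. How a real implementation would handle the static branch when the first process opts in is left open here.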