Re: yielding while running SCHED_DEADLINE
From: Peter Zijlstra <peterz@infradead.org>
Date: 2018-09-17 17:09:20
On Mon, Sep 17, 2018 at 11:26:48AM +0200, Juri Lelli wrote:
> Hi,
>
> On 14/09/18 23:13, Patel, Vedang wrote:
> > Hi all,
> >
> > We have been playing around with SCHED_DEADLINE and found some
> > discrepancy around the calculation of nr_involuntary_switches and
> > nr_voluntary_switches in /proc/${PID}/sched. Whenever the task is done
> > with its work earlier and executes sched_yield() to voluntarily give
> > up the CPU this increments nr_involuntary_switches. It should have
> > incremented nr_voluntary_switches.
>
> Mmm, I see what you are saying.
>
> [...]
>
> > Looking at __schedule() in kernel/sched/core.c, the switch is counted
> > as part of nr_involuntary_switches if the task has not been preempted
> > and the task is TASK_RUNNING state. This does not seem to happen when
> > sched_yield() is called.
>
> Mmm,
>
>  - nr_voluntary_switches++ if !preempt && !RUNNING
>  - nr_involuntary_switches++ otherwise (yield fits this as the task is
>    still RUNNING, even though throttled for DEADLINE)
>
> Not sure this is the same as what you say above..
>
> > Is there something we are missing over here? OR Is this a known issue
> > and is planned to be fixed later?
>
> .. however, not sure. Peter, what do you say? It looks like we might
> indeed want to account yield as a voluntary switch, seems to fit. In
> this case I guess we could use a flag or add a sched_ bit to
> task_struct to handle the case?

It's been like this _forever_ afaict. This isn't deadline specific afaict, all yield callers will end up in non-voluntary switches. I don't know anybody that cares and I don't think this is something worth fixing. If someone did rely on this behaviour we'd break them, and I'd much rather save a cycle than add more stupid stats crap to the scheduler.
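
The accounting rule discussed in this thread (voluntary only if !preempt && !RUNNING, involuntary otherwise) can be sketched as a small userspace model. This is only an illustration of that rule, not the kernel's code: the struct, the enum and the function names below are hypothetical and do not exist in kernel/sched/core.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task state; only "RUNNING or not" matters here. */
enum model_state { MODEL_RUNNING = 0, MODEL_SLEEPING = 1 };

/* Hypothetical counters mirroring the two fields shown in /proc/${PID}/sched. */
struct model_counts {
	unsigned long nr_voluntary_switches;
	unsigned long nr_involuntary_switches;
};

/*
 * Model of the rule quoted in the thread:
 *   voluntary   if !preempt && !RUNNING
 *   involuntary otherwise
 */
static void model_account_switch(struct model_counts *c, bool preempt,
				 enum model_state state)
{
	if (!preempt && state != MODEL_RUNNING)
		c->nr_voluntary_switches++;
	else
		c->nr_involuntary_switches++;
}

/* Runs three representative switches and returns the resulting counters. */
static struct model_counts model_demo(void)
{
	struct model_counts c = { 0, 0 };

	/* Task blocks (e.g. sleeps): not preempted, not RUNNING -> voluntary. */
	model_account_switch(&c, false, MODEL_SLEEPING);
	/* Task is preempted while RUNNING -> involuntary. */
	model_account_switch(&c, true, MODEL_RUNNING);
	/* Task yields: not preempted, but still RUNNING -> involuntary. */
	model_account_switch(&c, false, MODEL_RUNNING);

	return c;
}
```

Under this model the yield case lands in the involuntary counter simply because the task never leaves the RUNNING state, which matches the behaviour reported at the start of the thread.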