Re: [PATCH v2 3/3] sched/fair: break out of newidle balancing if an RT task appears
From: Mike Galbraith <hidden>
Date: 2021-05-15 07:30:46
Also in:
lkml
On Fri, 2021-05-07 at 13:03 +0200, Peter Zijlstra wrote:
newidle_balance() already has 'stop, you're spending too much time' controls; you've failed to explain how those are falling short and why they cannot be improved.
Greetings,
I bet a nickle he's meeting contention on a more painful scale than the
2212.429350 sample from my little i4790 desktop box, as master.today+rt
does.. nothing remotely RT like, just plays youtube as I instead watch
trace output.
homer:..debug/tracing # tail -20 trace
MediaPl~back #1-17837 [003] d...2.. 2212.292335: load_balance: NEWIDLE rq:2 lock acquired
Gecko_IOThread-4001 [000] d...2.. 2212.315030: load_balance: NEWIDLE rq:3 lock acquired
firefox-3980 [007] d...1.. 2212.315030: load_balance: NEWIDLE rq:3 locked - aborting
Timer-4089 [007] d...2.. 2212.317260: load_balance: NEWIDLE rq:0 lock acquired
Timer-4089 [007] d...2.. 2212.317319: load_balance: NEWIDLE rq:0 lock acquired
ksoftirqd/6-51 [006] d...2.. 2212.317358: load_balance: NEWIDLE rq:0 lock acquired
Timer-4089 [007] d...2.. 2212.317474: load_balance: NEWIDLE rq:0 lock acquired
MediaPl~back #2-17839 [002] d...1.. 2212.345438: load_balance: NEWIDLE rq:3 locked - aborting
Chrome_~dThread-16830 [006] d...2.. 2212.345438: load_balance: NEWIDLE rq:3 lock acquired
AudioIP~ent RPC-17665 [003] d...2.. 2212.404999: load_balance: NEWIDLE rq:5 lock acquired
ImageBridgeChld-16855 [001] d...2.. 2212.429350: load_balance: NEWIDLE rq:6 lock acquired
MediaPl~back #1-17837 [000] d...1.. 2212.429351: load_balance: NEWIDLE rq:6 locked - aborting
Chrome_~dThread-16830 [002] d...1.. 2212.429356: load_balance: NEWIDLE rq:6 locked - aborting
X-2157 [003] d...2.. 2212.461351: load_balance: NEWIDLE rq:6 lock acquired
<...>-4043 [006] d...2.. 2212.480451: load_balance: NEWIDLE rq:2 lock acquired
ksoftirqd/0-12 [000] d...2.. 2212.505545: load_balance: NEWIDLE rq:3 lock acquired
ksoftirqd/0-12 [000] d...2.. 2212.505550: load_balance: NEWIDLE rq:3 lock acquired
threaded-ml-4399 [001] d...2.. 2212.517943: load_balance: NEWIDLE rq:7 lock acquired
pulseaudio-1917 [002] d...2.. 2212.575245: load_balance: NEWIDLE rq:6 lock acquired
IPDL Background-4028 [004] d...2.. 2212.581085: load_balance: NEWIDLE rq:5 lock acquired
homer:..debug/tracing #
homer:..debug/tracing # cat trace | grep lock | wc -l
218165
homer:..debug/tracing # cat trace | grep abort | wc -l
40081
I still carry the below in my local RT trees, to which I added those
trace_printk()s. It was born some years ago on a 64 core box.
---
kernel/sched/fair.c | 11 +++++++++++
kernel/sched/features.h | 4 ++++
kernel/sched/sched.h | 10 ++++++++++
3 files changed, 25 insertions(+)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c@@ -7200,6 +7200,12 @@ done: __maybe_unused; if (!rf) return NULL; +#ifdef CONFIG_PREEMPT_RT + /* RT tasks have better things to do than load balancing. */ + if (prev && prev->sched_class != &fair_sched_class) + return NULL; +#endif + new_tasks = newidle_balance(rq, rf); /*
@@ -9669,7 +9675,12 @@ static int load_balance(int this_cpu, st env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running); more_balance: +#ifdef CONFIG_PREEMPT_RT + if (!rq_lock_trylock_irqsave(busiest, &rf)) + goto out; +#else rq_lock_irqsave(busiest, &rf); +#endif update_rq_clock(busiest); /* --- a/kernel/sched/features.h +++ b/kernel/sched/features.h
@@ -86,7 +86,11 @@ SCHED_FEAT(RT_PUSH_IPI, true) #endif SCHED_FEAT(RT_RUNTIME_SHARE, false) +#ifndef CONFIG_PREEMPT_RT SCHED_FEAT(LB_MIN, false) +#else +SCHED_FEAT(LB_MIN, true) +#endif SCHED_FEAT(ATTACH_AGE_LOAD, true) SCHED_FEAT(WA_IDLE, true) --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h
@@ -1322,6 +1322,16 @@ rq_lock_irqsave(struct rq *rq, struct rq rq_pin_lock(rq, rf); } +static inline int +rq_lock_trylock_irqsave(struct rq *rq, struct rq_flags *rf) + __acquires(rq->lock) +{ + if (!raw_spin_trylock_irqsave(&rq->lock, rf->flags)) + return 0; + rq_pin_lock(rq, rf); + return 1; +} + static inline void rq_lock_irq(struct rq *rq, struct rq_flags *rf) __acquires(rq->lock)