Re: [RFC][PATCH RT 0/4] sched/rt: Lower rq lock contention latencies on many CPU boxes
From: Clark Williams <hidden>
Date: 2012-12-10 22:59:44
Also in: lkml
On Fri, 07 Dec 2012 18:56:15 -0500 Steven Rostedt [off-list ref] wrote:
> I've been debugging large latencies on a 40-core box and found that a major cause is the thundering-herd-like grab of the rq lock in the pull_rt_task() logic. Basically, if a large number of CPUs were to lower their priority at roughly the same time, they would all trigger a pull. If there happens to be only one CPU available to get a task from, all CPUs doing the pull will try to grab it. In doing so, they will all contend on the rq lock of the overloaded CPU. Only one CPU will succeed in pulling the task and, unfortunately, there's no quick way to know which, as it depends on the affinity of the task that needs to be pulled, and to look at that, we need to grab its rq lock!
>
> Instead of having the pull logic grab the rq locks and do the work to switch the task over to the pulling CPU, this patch series (well, patch #3) has the pulling CPU send an IPI to the overloaded CPU, and that CPU will do the push instead. The push logic uses the cpupri.c code to quickly find the best CPU to offload the overloaded RT task to, so it is quite efficient to do this. Retrieving multiple IPIs has a much lower overhead than all the CPUs grabbing the rq lock.
>
> The other three patches are fixes/enhancements to the push/pull code that I found while debugging the latencies.
>
> Note, although this patch series is made for the -rt patch, the issues apply to mainline as well. But because -rt has the migrate_disable() code, this patch series is tailored to that. If we can vet this out in -rt, all this code should make its way quickly to mainline.
>
> I tested this code out, but it probably needs some cleanup and definitely more comments. I'm only posting this as an RFC for now to get feedback on the idea.
>
> Thanks!
Steve,

I've been running this set of patches on my laptop with an RT kernel since Friday with no ill effects. I just applied it to v3.6.10+rt21 and it seems to be fine.

I've got rteval runs going on a 40-core and a 24-core box, which will be done early Tuesday morning, so I'll let you know the results then.

Clark
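For readers following the thread, the contention argument in the quoted cover letter can be illustrated with a toy model. This is only a sketch, not the kernel code: `Task`, `simulate_pull`, `simulate_push`, and the `cpu_prio` mapping are hypothetical names, "smaller number means more urgent" is a convention chosen for the model, and the target selection is a simple stand-in for the cpupri lookup rather than its real algorithm. The model counts only how often a remote CPU takes the overloaded CPU's rq lock; it ignores the lock on the destination runqueue and the overloaded CPU's own locking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    prio: int             # smaller value = more urgent (convention of this model only)
    allowed: frozenset    # CPUs the task's affinity permits


def simulate_pull(pullers, task):
    """Pull model: every pulling CPU locks the overloaded runqueue to inspect the task."""
    remote_locks = 0
    winner = None
    for cpu in pullers:
        remote_locks += 1                      # affinity is only readable under the lock
        if winner is None and cpu in task.allowed:
            winner = cpu                       # first eligible puller takes the task
    return {"remote_locks": remote_locks, "ipis": 0, "winner": winner}


def simulate_push(pullers, task, cpu_prio):
    """Push model: every pulling CPU sends an IPI; the overloaded CPU pushes the task."""
    ipis = 0
    winner = None
    for _ in pullers:
        ipis += 1                              # no remote CPU touches the rq lock
        if winner is None:
            # Stand-in for the cpupri lookup (an assumption, not the real code):
            # choose the eligible CPU running the least urgent work, provided
            # that work is less urgent than the task being pushed.
            eligible = [c for c in task.allowed
                        if c in cpu_prio and cpu_prio[c] > task.prio]
            if eligible:
                winner = max(eligible, key=lambda c: cpu_prio[c])
    return {"remote_locks": 0, "ipis": ipis, "winner": winner}


if __name__ == "__main__":
    pullers = list(range(1, 40))               # 39 CPUs lower their priority together
    task = Task(prio=10, allowed=frozenset({7}))
    cpu_prio = {cpu: 99 for cpu in pullers}    # every puller now runs low-priority work
    print(simulate_pull(pullers, task))
    print(simulate_push(pullers, task, cpu_prio))
```

With 39 pullers and a task pinned to CPU 7, the pull model records 39 remote acquisitions of the overloaded CPU's rq lock, while the push model records none and 39 IPIs instead; both end with the task on CPU 7.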