Re: Worst case performance of up()
From: Adrian Cox <hidden>
Date: 2006-12-02 10:36:00
On Mon, 2006-11-27 at 21:02 +0000, Adrian Cox wrote:
On Sat, 2006-11-25 at 07:45 +1100, Benjamin Herrenschmidt wrote:
On Fri, 2006-11-24 at 16:21 +0000, Adrian Cox wrote:
Does anybody have any ideas what could make up() take so long in this circumstance? I'd expect cache transfers to make the operation about 100 times slower, but this looks like repeated cache ping-pong between the two CPUs.

Is it hung in up() (toplevel) or __up (low level)?

Not yet proven.
By using a scope, I have further data: the system is hung in this line of resched_task() in kernel/sched.c:

    set_tsk_thread_flag(p, TIF_NEED_RESCHED);

During this time, there is a great deal of ARTRY activity on the bus. The sequence ends when the other CPU takes a timer tick. I'll need to track down what the other CPU is doing at this point, but my current hypothesis is that it's somewhere in schedule().
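For illustration only: set_tsk_thread_flag() is an atomic bit-set on the task's flag word, and on PowerPC (assumed here, as the ARTRY reference suggests) atomic bit operations are built from a load-reserve/store-conditional retry loop. The user-space C11 sketch below models such a loop with compare-and-swap, to show how one CPU could keep retrying while another CPU keeps touching the same word. The names demo_thread_info, demo_set_flag, demo_test_flag and DEMO_NEED_RESCHED are hypothetical; this is not kernel code and does not confirm the hypothesis above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task's flag word; not the kernel's thread_info. */
typedef struct {
    atomic_ulong flags;
} demo_thread_info;

#define DEMO_NEED_RESCHED 2UL /* illustrative bit number only */

/*
 * Set one bit with a compare-and-swap retry loop.
 * Returns the number of failed attempts before the update landed.
 * Under contention on the same word, that count can grow without bound.
 */
static unsigned long demo_set_flag(demo_thread_info *ti, unsigned long bit)
{
    unsigned long retries = 0;
    unsigned long old = atomic_load_explicit(&ti->flags, memory_order_relaxed);

    /* On failure, `old` is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(
               &ti->flags, &old, old | (1UL << bit),
               memory_order_acq_rel, memory_order_relaxed)) {
        retries++;
    }
    return retries;
}

/* Read back a single bit from the flag word. */
static bool demo_test_flag(demo_thread_info *ti, unsigned long bit)
{
    return (atomic_load_explicit(&ti->flags, memory_order_relaxed) >> bit) & 1UL;
}
```

Calling demo_set_flag() from one thread while a second thread repeatedly writes the same word would be expected to produce a non-zero retry count. How large it gets depends on the hardware and timing, so no figure is claimed here.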
Have you tried some oprofile runs to catch the exact instruction where the cycles appear to be wasted?
Oprofile turned out to break the error condition, by increasing the interrupt rate on each CPU. In the end a combination of lockmeter and an oscilloscope did the trick.

--
Adrian Cox [off-list ref]