Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule
From: Mike Galbraith <hidden>
Date: 2016-03-29 04:05:09
Also in: lkml
On Fri, 2016-03-25 at 17:24 +0100, Mike Galbraith wrote:
On Fri, 2016-03-25 at 10:13 +0100, Mike Galbraith wrote:
On Fri, 2016-03-25 at 09:52 +0100, Thomas Gleixner wrote:
On Fri, 25 Mar 2016, Mike Galbraith wrote:
On Thu, 2016-03-24 at 12:06 +0100, Mike Galbraith wrote:
On Thu, 2016-03-24 at 11:44 +0100, Thomas Gleixner wrote:
On the bright side, with the busted migrate enable business reverted, plus one dinky change from me [1], master-rt.today has completed 100 iterations of Steven's hotplug stress script along side endless futexstress, and is happily doing another 900 as I write this, so the next -rt should finally be hotplug deadlock free. Thomas's state machinery seems to work wonders. 'course this being hotplug, the other shoe will likely apply itself to my backside soon.

That's a given :)

blk-mq applied it shortly after I was satisfied enough to poke xmit.

The other shoe is that notifiers can depend upon RCU grace periods, so when pin_current_cpu() snags rcu_sched, the hotplug game is over.

blk_mq_queue_reinit_notify:
        /*
         * We need to freeze and reinit all existing queues.  Freezing
         * involves synchronous wait for an RCU grace period and doing it
         * one by one may take a long time.  Start freezing all queues in
         * one swoop and then wait for the completions so that freezing can
         * take place in parallel.
         */
        list_for_each_entry(q, &all_q_list, all_q_node)
                blk_mq_freeze_queue_start(q);
        list_for_each_entry(q, &all_q_list, all_q_node) {
                blk_mq_freeze_queue_wait(q);

Yeah, I stumbled over that already when analysing all the hotplug notifier sites. That's definitely a horrible one.
Hohum (sharpens rock), next.

/me recommends frozen sharks

With the sharp rock below and the one I'll follow up with, master-rt on my DL980 just passed 3 hours of endless hotplug stress concurrent with endless tbench 8, stockfish and futextest. It has never survived this long with this load by a long shot.

I knew it was unlikely to surrender that quickly. Oh well, on the bright side it seems to be running low on deadlocks.
The immunize rcu_sched rock did that btw. Having accidentally whacked the dump, I got to reproduce (took 30.03 hours) so I could analyze it.

Hohum, notifier woes definitely require somewhat sharper rocks. I could make rcu_sched dodge the migration thread, but think I'll apply frozen shark to blk-mq instead.

	-Mike

(a clever person would wait for Sir Thomas, remaining blissfully ignorant of the gory dragon slaying details, but whatever, premature testing and rt mole whacking may turn up something interesting, ya never know)
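
For readers following the quoted blk_mq_queue_reinit_notify comment, here is a rough userspace sketch of why "start all freezes, then wait" beats freezing one by one. It is only a toy model under stated assumptions: struct sim_queue, the sim_* helpers and the "tick" standing in for grace-period time are all made up for illustration, and none of it is kernel or blk-mq code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue; NOT the kernel's struct. */
struct sim_queue {
    unsigned int remaining; /* ticks left in this queue's simulated grace period */
    int freezing;           /* nonzero once the freeze has been started */
};

/* Advance simulated time by one tick for every queue whose freeze has started. */
static void sim_tick(struct sim_queue *qs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (qs[i].freezing && qs[i].remaining > 0)
            qs[i].remaining--;
}

/* Phase 1: begin the freeze and return immediately. */
static void sim_freeze_start(struct sim_queue *q)
{
    q->freezing = 1;
}

/* Phase 2: wait until queue idx is frozen; returns the ticks spent waiting. */
static unsigned int sim_freeze_wait(struct sim_queue *qs, size_t n, size_t idx)
{
    unsigned int waited = 0;

    if (!qs[idx].freezing)          /* guard: never wait on an unstarted freeze */
        sim_freeze_start(&qs[idx]);
    while (qs[idx].remaining > 0) {
        sim_tick(qs, n);
        waited++;
    }
    return waited;
}

/* One by one: start and wait per queue, so the grace periods add up. */
static unsigned int sim_freeze_serial(struct sim_queue *qs, size_t n)
{
    unsigned int total = 0;

    for (size_t i = 0; i < n; i++) {
        sim_freeze_start(&qs[i]);
        total += sim_freeze_wait(qs, n, i);
    }
    return total;
}

/* Batched: start every freeze first, then wait; the grace periods overlap. */
static unsigned int sim_freeze_batched(struct sim_queue *qs, size_t n)
{
    unsigned int total = 0;

    for (size_t i = 0; i < n; i++)
        sim_freeze_start(&qs[i]);
    for (size_t i = 0; i < n; i++)
        total += sim_freeze_wait(qs, n, i);
    return total;
}
```

In this model, three queues with simulated grace periods of 3, 1 and 2 ticks cost 6 ticks serially but only 3 when batched, since the total wait collapses to the longest single grace period. The flip side, as noted above, is that either way the notifier cannot finish unless grace periods actually complete.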