Re: [PATCH v2] livepatch: Fix idle cpu's tasks transition
From: Petr Mladek <pmladek@suse.com>
Date: 2021-09-20 08:28:02
On Wed 2021-09-15 16:18:01, Vasily Gorbik wrote:
> On an idle system with large amount of cpus it might happen that
> klp_update_patch_state() is not reached in do_idle() for a long periods
> of time. With debug messages enabled log is filled with:
> [ 499.442643] livepatch: klp_try_switch_task: swapper/63:0 is running
> without any signs of progress. Ending up with "failed to complete
> transition".
>
> On s390 LPAR with 128 cpus not a single transition is able to complete
> and livepatch kselftests fail. Tests on idling x86 kvm instance with 128
> cpus demonstrate similar symptoms with and without CONFIG_NO_HZ.
>
> To deal with that, since runqueue is already locked in
> klp_try_switch_task() identify idling cpus and trigger rescheduling
> potentially waking them up and making sure idle tasks break out of
> do_idle() inner loop and reach klp_update_patch_state(). This helps to
> speed up transition time while avoiding unnecessary extra system load.
>
> Reviewed-by: Petr Mladek <pmladek@suse.com>
> Acked-by: Miroslav Benes <mbenes@suse.cz>
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
> ---
> Ingo/Peter, as Josh mentioned, could you please ack if you are ok with
> livepatch calling this private scheduler interface?

Ingo, Peter, Josh, could anyone please ack that it is acceptable to call
resched_curr(rq) from the livepatch code? Or is there a better way to
make an idle task go through the main cycle?

Best Regards,
Petr

> Changes since v1:
> - added comments suggested by Petr
>   lkml.kernel.org/r/patch.git-a4aad6b1540d.your-ad-here.call-01631177886-ext-3083@work.hours
>
> Previous discussion and RFC PATCH:
>   lkml.kernel.org/r/patch.git-b76842ceb035.your-ad-here.call-01625661932-ext-1304@work.hours
>
>  kernel/livepatch/transition.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index 291b857a6e20..2846a879f2dc 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -278,6 +278,8 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
>   * Try to safely switch a task to the target patch state. If it's currently
>   * running, or it's sleeping on a to-be-patched or to-be-unpatched function, or
>   * if the stack is unreliable, return false.
> + *
> + * Idle tasks are switched in the main loop when running.
>   */
>  static bool klp_try_switch_task(struct task_struct *task)
>  {
> @@ -308,6 +310,12 @@ static bool klp_try_switch_task(struct task_struct *task)
>  	rq = task_rq_lock(task, &flags);
>  
>  	if (task_running(rq, task) && task != current) {
> +		/*
> +		 * Idle task might stay running for a long time. Switch them
> +		 * in the main loop.
> +		 */
> +		if (is_idle_task(task))
> +			resched_curr(rq);
>  		snprintf(err_buf, STACK_ERR_BUF_SIZE,
>  			 "%s: %s:%d is running\n", __func__, task->comm,
>  			 task->pid);
> --
> 2.25.4