Re: [PATCH] blk-cgroup: check blkcg policy is enabled in blkg_create()
From: yukuai (C) <hidden>
Date: 2021-10-12 01:09:14
Also in:
cgroups, lkml
On 2021/10/11 23:23, Michal Koutný wrote:
Hello. On Fri, Oct 08, 2021 at 03:27:20PM +0800, Yu Kuai [off-list ref] wrote:quoted
This is because blkg_alloc() is called from blkg_conf_prep() without holding 'q->queue_lock', and elevator is exited before blkg_create():IIUC the problematic interleaving is this one (I've noticed `blkg->pd[i] = NULL` to thread 2 call trace):
The new blkg will not add to blkg_list untill pd_init_fn() is done in blkg_create(), thus blkcg_deactivate_policy() can't access this blkg.
quoted
thread 1 thread 2 blkg_conf_prep spin_lock_irq(&q->queue_lock); blkg_lookup_check -> return NULL spin_unlock_irq(&q->queue_lock); blkg_alloc blkcg_policy_enabled -> true pd = ->pd_alloc_fn blk_mq_exit_sched bfq_exit_queue blkcg_deactivate_policy spin_lock_irq(&q->queue_lock); __clear_bit(pol->plid, q->blkcg_pols);pol->pd_free_fn(blkg->pd[i]); blkg->pd[i] = NULL;quoted
spin_unlock_irq(&q->queue_lock); q->elevator = NULL;blkg->pd[i] = pdquoted
spin_lock_irq(&q->queue_lock); blkg_create if (blkg->pd[i]) ->pd_init_fn -> q->elevator is NULL spin_unlock_irq(&q->queue_lock);In high-level terms, is this a race between (blk)io controller attribute write and a device scheduler (elevator) switch? If so, I'd add it to the commit message.quoted
Fix the problem by checking that policy is still enabled in blkg_create().Is this sufficient wrt some other q->elevator users later?quoted
@@ -252,6 +266,9 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg, goto err_free_blkg; }I'd add a comment here like:quoted
Re-check policies are still enabled, since the caller blkg_conf_prep() temporarily drops q->queue_lock and we can race with blk_mq_exit_sched() removing policies.
Thanks for your advice. Best regards, Kuai
quoted
+ if (new_blkg) + blkg_check_pd(q, new_blkg); +Thanks, Michal .