Re: [PATCH v5 00/14] Add BLK_CTL support for i.MX8MP
From: Adam Ford <hidden>
Date: 2021-03-22 23:50:37
Also in:
linux-clk, linux-devicetree, linux-pm, lkml
On Wed, Mar 3, 2021 at 4:54 AM Marek Vasut [off-list ref] wrote:
On 3/3/21 11:47 AM, Abel Vesa wrote:quoted
On 20-11-03 13:18:12, Abel Vesa wrote:quoted
The BLK_CTL according to HW design is basically the wrapper of the entire function specific group of IPs and holds GPRs that usually cannot be placed into one specific IP from that group. Some of these GPRs are used to control some clocks, other some resets, others some very specific function that does not fit into clocks or resets. Since the clocks are registered using the i.MX clock subsystem API, the driver is placed into the clock subsystem, but it also registers the resets. For the other functionalities that other GPRs might have, the syscon is used.This approach seems to be introducing a possible ABBA deadlock due to the core clock and genpd locking. Here is a backtrace I got from Pete Zhang (he reported the issue on the internal mailing list): [ 11.667711][ T108] -> #1 (&genpd->mlock){+.+.}-{3:3}: [ 11.675041][ T108] __lock_acquire+0xae4/0xef8 [ 11.680093][ T108] lock_acquire+0xfc/0x2f8 [ 11.684888][ T108] __mutex_lock+0x90/0x870 [ 11.689685][ T108] mutex_lock_nested+0x44/0x50 [ 11.694826][ T108] genpd_lock_mtx+0x18/0x24 [ 11.699706][ T108] genpd_runtime_resume+0x90/0x214 (hold genpd->mlock) [ 11.705194][ T108] __rpm_callback+0x80/0x2c0 [ 11.710160][ T108] rpm_resume+0x468/0x650 [ 11.714866][ T108] __pm_runtime_resume+0x60/0x88 [ 11.720180][ T108] clk_pm_runtime_get+0x28/0x9c [ 11.725410][ T108] clk_disable_unused_subtree+0x8c/0x144 [ 11.731420][ T108] clk_disable_unused_subtree+0x124/0x144 [ 11.737518][ T108] clk_disable_unused+0xa4/0x11c (hold prepare_lock) [ 11.742833][ T108] do_one_initcall+0x98/0x178 [ 11.747888][ T108] do_initcall_level+0x9c/0xb8 [ 11.753028][ T108] do_initcalls+0x54/0x94 [ 11.757736][ T108] do_basic_setup+0x24/0x30 [ 11.762614][ T108] kernel_init_freeable+0x70/0xa4 [ 11.768014][ T108] kernel_init+0x14/0x18c [ 11.772722][ T108] ret_from_fork+0x10/0x18 [ 11.777512][ T108] -> #0 (prepare_lock){+.+.}-{3:3}: [ 11.784749][ T108] check_noncircular+0x134/0x13c [ 11.790064][ T108] validate_chain+0x590/0x2a04 [ 11.795204][ T108] __lock_acquire+0xae4/0xef8 [ 11.800258][ T108] lock_acquire+0xfc/0x2f8 [ 11.805050][ T108] __mutex_lock+0x90/0x870 [ 11.809841][ T108] mutex_lock_nested+0x44/0x50 [ 11.814983][ T108] clk_unprepare+0x5c/0x100 ((hold prepare_lock)) [ 11.819864][ T108] imx8m_pd_power_off+0xac/0x110 [ 11.825179][ T108] genpd_power_off+0x1b4/0x2dc [ 11.830318][ T108] genpd_power_off_work_fn+0x38/0x58 (hold genpd->mlock) [ 11.835981][ T108] process_one_work+0x270/0x444 [ 11.841208][ T108] worker_thread+0x280/0x4e4 [ 11.846176][ T108] kthread+0x13c/0x14 [ 11.850621][ T108] ret_from_fork+0x10/0x18 Now, this has been reproduced only on the NXP internal tree, but I think it is pretty obvious this could happen in upstream too, with this patchset applied. First, my thought was to change the prepare_lock/enable_lock in clock core, from a global approach to a per clock basis. But that doesn't actually fix the issue. The usecase seen above is due to clk_disable_unused, but the same could happen when a clock consumer calls prepare/unprepare on a clock. I guess the conclusion is that the current state of the clock core and genpd implementation does not support a usecase where a clock controller has a PD which in turn uses another clock (from another clock controller). Jacky, Pete, did I miss anything here ?Just for completeness, I have a feeling I already managed to trigger this and discussed this with Lucas before, so this concern is certainly valid.
I know it may not be ideal, someone tied a soft-reset and soft-enable to the driver of the Hantro VPU on the IMXMQ [1], and I wonder if something similar could be done for the drivers who are consumers of the clocks. For example: lcdif could request the power domain. The power domain soft-resets and enables bus clock (vis syscon) After successful enabling of power-domain, the LCDIF requests the soft reset and respective clock bits (also via syscon) similar to how [1] and [2] do it for the Hantro VPU. The syscon node could be a common node similar to what was done in [2], and multiple consumers could control when each soft-reset and clock-enable get activated. I know it's probably more of an abuse of the syscon system, but having the individual drivers control the order of operations might be safer than trying to create a one-size-fits-all driver. adam [1] - https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210318082046.51546-4-benjamin.gaignard@collabora.com/ [2] - https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210318082046.51546-14-benjamin.gaignard@collabora.com/ _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel