Thread: 128 messages, 9 authors, 2022-09-24

Re: [PATCH v2 00/41] drm: Analog TV Improvements

From: Noralf Trønnes <hidden>
Date: 2022-09-07 16:45:38
Also in: dri-devel, intel-gfx, linux-sunxi, lkml, nouveau


On 07.09.2022 12:36, Stefan Wahren wrote:
Hi Maxime,

On 05.09.22 at 16:57, Maxime Ripard wrote:
quoted
On Fri, Sep 02, 2022 at 01:28:16PM +0200, Noralf Trønnes wrote:
quoted
On 01.09.2022 21:35, Noralf Trønnes wrote:
quoted
I have finally found a workaround for my kernel hangs.

Dom had a look at my kernel and found that the VideoCore was fine, and
he said this:
quoted
That suggests cause of lockup was on arm side rather than VC side.

But it's hard to diagnose further. Once you've had a peripheral not
respond, the AXI bus locks up and no further operations are possible.
Usual causes of this are required clocks being stopped or domains
disabled and then trying to access the hardware.
So when I got this on my 64-bit build:

[  166.702171] SError Interrupt on CPU1, code 0x00000000bf000002 -- SError
[  166.702187] CPU: 1 PID: 8 Comm: kworker/u8:0 Tainted: G        W      5.19.0-rc6-00096-gba7973977976-dirty #1
[  166.702200] Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT)
[  166.702206] Workqueue: events_freezable_power_ thermal_zone_device_check
[  166.702231] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  166.702242] pc : regmap_mmio_read32le+0x10/0x28
[  166.702261] lr : regmap_mmio_read+0x44/0x70
...
[  166.702606]  bcm2711_get_temp+0x58/0xb0 [bcm2711_thermal]

I wondered if that reg read was stalled due to a clock being stopped.

Lo and behold, disabling runtime pm and keeping the vec clock running
all the time fixed it[1].

I don't know what the problem is, but at least I can now test this
patchset.

[1] https://gist.github.com/notro/23b984e7fa05cfbda2db50a421cac065
It turns out I didn't have to disable runtime pm:
https://gist.github.com/notro/0adcfcb12460b54e54458afe11dc8ea2
If the bcm2711_thermal IP needs that clock to be enabled, it should grab
a reference itself, but it looks like even the device tree binding
doesn't ask for one.
The missing clock in the device tree binding is expected, because
apart from the code there is not much information about the BCM2711
clock tree. But I'm skeptical that the AVS IP actually needs the VEC
clock; I think it's more likely that the VEC clock parent is changed and
that this causes the issue. I could take care of the bcm2711 binding & driver
if I know which clock is really necessary.
Seems you're right, keeping the parent always enabled is enough:

	clk_prepare_enable(clk_get_parent(vec->clock)); // pllc_per

I tried enabling just the grandparent clock as well, but that didn't help.
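
To illustrate why holding a reference on the parent behaves differently from holding one on the grandparent, here is a minimal user-space sketch of reference-counted clock enabling. It is a toy model under stated assumptions: `struct toy_clk` and the `toy_clk_*` helpers are hypothetical names, not the kernel's clk API, and the sketch does not claim to explain the actual cause of the hang on BCM2711.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a reference-counted clock; NOT the kernel's struct clk. */
struct toy_clk {
	const char *name;
	struct toy_clk *parent;		/* NULL for a root clock */
	unsigned int enable_count;
};

/* The first enable of a clock also takes a reference on its parent chain. */
static void toy_clk_enable(struct toy_clk *clk)
{
	if (!clk)
		return;
	if (clk->enable_count++ == 0)
		toy_clk_enable(clk->parent);
}

/* Dropping the last reference releases the parent chain as well. */
static void toy_clk_disable(struct toy_clk *clk)
{
	if (!clk)
		return;
	assert(clk->enable_count > 0);	/* unbalanced disable is a bug */
	if (--clk->enable_count == 0)
		toy_clk_disable(clk->parent);
}

/* A clock is considered running while anyone holds a reference on it. */
static bool toy_clk_is_running(const struct toy_clk *clk)
{
	return clk->enable_count > 0;
}
```

In this model, an extra `toy_clk_enable()` on the parent keeps it running across an enable/disable cycle of the child, while an extra reference on the grandparent alone still lets the parent's count drop to zero.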

Without the clock hack, the hang seems to occur when switching between
NTSC and PAL; at most I've been able to do that 4-5 times before it hangs.

For a while it looked like fbdev/fbcon played a part in this, but then I
realised that it just gave me an NTSC mode to start from and to go back
to when quitting modetest.

Noralf.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel