Re: netconsole: HARDIRQ-safe -> HARDIRQ-unsafe lock order warning
From: Breno Leitao <leitao@debian.org>
Date: 2025-08-20 17:36:13
Also in: lkml
On Wed, Aug 20, 2025 at 02:31:02PM +0200, Mike Galbraith wrote:
> On Tue, 2025-08-19 at 10:27 -0700, Breno Leitao wrote:
> > I’ve continued investigating possible solutions, and it looks like moving netconsole over to the non‑blocking console (nbcon) framework might be the right approach. Unlike the classic console path, nbcon doesn’t rely on the global console lock, which was one of the main concerns regarding the possible deadlock.
>
> ATM at least, classic can remotely log a crash whereas nbcon can't per test drive, so it would be nice for classic to stick around until nbcon learns some atomic packet blasting.
Oh, does that mean that during a crash nbcon invokes the `write_atomic` callback, and because this patch doesn't implement it, those packets are never sent? Am I reading that correctly?
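To make the question concrete, here is a minimal userspace sketch of the behaviour being asked about. It is not kernel code: `struct model_console`, `model_emit()` and `model_store()` are hypothetical names, and the rule that a panic path only ever uses `write_atomic` is the assumption under discussion, not something confirmed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a console; NOT the kernel's struct console. */
struct model_console {
    const char *name;
    /* Used from normal (schedulable) context. */
    void (*write_thread)(struct model_console *con, const char *msg);
    /* Used from atomic/panic context; NULL if the driver lacks it. */
    void (*write_atomic)(struct model_console *con, const char *msg);
    char last_msg[64];
};

/* Toy "transmit": remember the last message that was sent. */
static void model_store(struct model_console *con, const char *msg)
{
    strncpy(con->last_msg, msg, sizeof(con->last_msg) - 1);
    con->last_msg[sizeof(con->last_msg) - 1] = '\0';
}

/*
 * Assumed dispatch rule: in panic context only write_atomic may be
 * used, so a console without it drops the message. Returns true if
 * the message was emitted.
 */
static bool model_emit(struct model_console *con, const char *msg,
                       bool in_panic)
{
    if (in_panic) {
        if (!con->write_atomic)
            return false;   /* nothing reaches the wire */
        con->write_atomic(con, msg);
        return true;
    }
    if (!con->write_thread)
        return false;
    con->write_thread(con, msg);
    return true;
}
```

Under that assumption, a console that only sets `write_thread` logs normally but stays silent once `in_panic` is true, which would match the "nbcon can't remotely log a crash" observation.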
> > The new path is protected by NETCONSOLE_NBCON, which is disabled by default. This allows us to experiment and test both approaches.
>
> As patch sits, interrupts being disabled is still a problem, gripes below.
You mean that IRQs are disabled when target_list_lock is acquired? If so, one option is to turn that list into an RCU list?!
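A rough userspace sketch of the RCU-list idea, using C11 atomics to stand in for the publish/read ordering that the kernel's `list_add_rcu()` and `list_for_each_entry_rcu()` provide. All `model_*` names are hypothetical, writers are assumed to be serialized by the caller, and removal (which would need a grace period) is deliberately not modelled.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a netconsole target. */
struct model_target {
    int id;
    struct model_target *_Atomic next;
};

struct model_target_list {
    struct model_target *_Atomic head;
};

/*
 * Writer side: link the new node first, then publish it with a
 * release store so readers never see a half-initialized node.
 * Assumes callers serialize writers (e.g. with a mutex).
 */
static void model_target_add(struct model_target_list *list,
                             struct model_target *t)
{
    struct model_target *old =
        atomic_load_explicit(&list->head, memory_order_relaxed);

    atomic_store_explicit(&t->next, old, memory_order_relaxed);
    atomic_store_explicit(&list->head, t, memory_order_release);
}

/*
 * Reader side: walk the list with acquire loads only. No lock is
 * taken and nothing here requires interrupts to be disabled.
 */
static int model_target_count(struct model_target_list *list)
{
    int n = 0;
    struct model_target *t =
        atomic_load_explicit(&list->head, memory_order_acquire);

    while (t) {
        n++;
        t = atomic_load_explicit(&t->next, memory_order_acquire);
    }
    return n;
}
```

The point of the sketch is only that the write path would no longer need an irqsave lock around the traversal; whether that fits netconsole's target lifetime rules is a separate question.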
> Not disabling IRQs makes nbcon gripe free, but creates the issue of netpoll_tx_running() lying to the rest of NETPOLL consumers.
>
> RT and the wireless stack have in common that IRQs being disabled in netpoll.c sucks rocks for them. I've been carrying a hack to allow RT to use netconsole since 5.15, and adapted it to squelch nbcon-inspired gripes as well (had to whack irqsave/restore in your patch as well). Once the dust settles, perhaps RT can simply select NETCONSOLE_NBCON to solve its netconsole woes for free.
What is this patch you have?

Thanks
--breno