Thread: 45 messages, 9 authors, 2025-10-01

Re: netconsole: HARDIRQ-safe -> HARDIRQ-unsafe lock order warning

From: Breno Leitao <leitao@debian.org>
Date: 2025-09-10 18:23:13
Also in: lkml

On Wed, Sep 10, 2025 at 02:28:40PM +0206, John Ogness wrote:
On 2025-09-09, Breno Leitao [off-list ref] wrote:
quoted
  b) Send the message anyway (and hope for the best)
    Cons: Netpoll will continue to call IRQ unsafe locks from IRQ safe
          context (lockdep will continue to be unhappy)
    Pro: This is how it works today already, so, it is not making the problem worse.
         In fact, it is narrowing the problem to only .write_atomic().
Two concerns here:

1. ->write_atomic() is also used during normal operation

2. It is expected that ->write_atomic() callbacks are implemented
   safely. The other nbcon citizens are doing this. Having an nbcon
   driver with an unsafe ->write_atomic() puts all nbcon drivers at risk
   of not functioning during panic.

This could be combined with (a) so that ->write_atomic() implements its
own deferred queue of messages to print and only when
@legacy_allow_panic_sync is true, will it try to send immediately and
hope for the best. @legacy_allow_panic_sync is set after all nbcon
drivers have had a chance to flush their buffers safely and then the
kernel starts to allow less safe drivers to flush.
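
For illustration, the combined (a)+(b) policy described above could be modelled roughly as below. This is a userspace sketch, not kernel code: allow_panic_sync only stands in for @legacy_allow_panic_sync, and struct defer_queue, unsafe_send() and model_write_atomic() are hypothetical names made up for the example. The queue depth and message size are arbitrary assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEFER_SLOTS 8	/* hypothetical queue depth */
#define MSG_MAX     64	/* hypothetical maximum message length */

/* Stand-in for the kernel's @legacy_allow_panic_sync flag (assumption). */
static bool allow_panic_sync;

/* Counts messages handed to the simulated network path. */
static unsigned int sent_count;

struct defer_queue {
	char msgs[DEFER_SLOTS][MSG_MAX];
	unsigned int head;	/* next slot to send */
	unsigned int tail;	/* next slot to fill */
};

/* Simulated unsafe send; a real driver would enter the network path here. */
static void unsafe_send(const char *msg)
{
	(void)msg;
	sent_count++;
}

/* Queue a message; returns false (message dropped) when the queue is full. */
static bool defer_msg(struct defer_queue *q, const char *msg)
{
	char *slot;

	if (q->tail - q->head == DEFER_SLOTS)
		return false;

	slot = q->msgs[q->tail % DEFER_SLOTS];
	strncpy(slot, msg, MSG_MAX - 1);
	slot[MSG_MAX - 1] = '\0';
	q->tail++;
	return true;
}

/* Send everything queued so far, oldest first. */
static void flush_deferred(struct defer_queue *q)
{
	while (q->head != q->tail) {
		unsafe_send(q->msgs[q->head % DEFER_SLOTS]);
		q->head++;
	}
}

/*
 * Defer by default; send immediately (and hope for the best) only once
 * the less safe flushing phase has been allowed.
 */
static bool model_write_atomic(struct defer_queue *q, const char *msg)
{
	if (!allow_panic_sync)
		return defer_msg(q, msg);

	flush_deferred(q);	/* keep ordering: older messages go first */
	unsafe_send(msg);
	return true;
}
```

In this model nothing reaches the wire until the flag flips, and a full queue drops new messages, which is one of the trade-offs a real implementation would have to decide on.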

Although I would prefer the NBCON_ATOMIC_UNSAFE approach instead.
Agree. That seems a more straightforward solution for drivers, and it
clearly would help the netconsole case.
quoted
  c) Not implementing .write_atomic
    Cons: we lose the most important messages of the boot.

  Any other option I am not seeing?
d) Not implementing ->write_atomic() and instead implement a kmsg_dumper
   for netconsole. This registers a callback that is called during
   panic.

   Con: The kmsg_dumper interface has nothing to do with consoles, so it
        would require some effort coordinating with the console drivers.
I am looking at the kmsg_dumper interface, and it doesn't have the buffers
that need to be dumped.

So, if I understand correctly, my kmsg_dumper callback needs to loop over
the message buffer and print the remaining messages, right?

In other words, do I need to track which messages were sent by
netconsole, and then iterate over the kmsg buffer to find the messages
that haven't been sent, and send from there?
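
To make the question concrete, here is a rough userspace model of the tracking I have in mind. It is only a sketch under assumptions: struct log_ring, struct net_dumper, log_store() and dump_unsent() are hypothetical names, and the ring is a toy stand-in for the printk buffer. In the kernel, the iteration would presumably go through the kmsg_dump iterator helpers (kmsg_dump_rewind(), kmsg_dump_get_line()), which this model does not use.

```c
#include <stdint.h>
#include <stddef.h>

#define LOG_SLOTS 4	/* hypothetical ring size, tiny for illustration */

struct log_record {
	uint64_t seq;		/* monotonically increasing sequence number */
	const char *text;
};

struct log_ring {
	struct log_record rec[LOG_SLOTS];
	uint64_t next_seq;	/* sequence number the next record will get */
};

struct net_dumper {
	uint64_t next_unsent;	/* first sequence number not yet sent */
	unsigned int sent;	/* records sent by the dumper */
	unsigned int lost;	/* records overwritten before being sent */
};

/* Append a record, overwriting the oldest one when the ring is full. */
static void log_store(struct log_ring *ring, const char *text)
{
	struct log_record *r = &ring->rec[ring->next_seq % LOG_SLOTS];

	r->seq = ring->next_seq++;
	r->text = text;
}

/* Panic-time model: send every record that was not sent earlier. */
static void dump_unsent(struct net_dumper *d, const struct log_ring *ring)
{
	uint64_t oldest = ring->next_seq > LOG_SLOTS ?
			  ring->next_seq - LOG_SLOTS : 0;

	/* Records older than the ring's tail are gone; account for them. */
	if (d->next_unsent < oldest) {
		d->lost += (unsigned int)(oldest - d->next_unsent);
		d->next_unsent = oldest;
	}

	while (d->next_unsent < ring->next_seq) {
		/* A real dumper would transmit the record's text here. */
		d->sent++;
		d->next_unsent++;
	}
}
```

The point is that the dumper would need its own "next unsent" cursor, updated by the normal send path as well, and would have to cope with records that were overwritten before it got to run.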
   Pro: There is absolute freedom for the dumper to implement its own
        panic-only solution to get messages out.
What about .write_atomic() calls that happen outside of panic? Will
those messages be lost with this approach?
e) Involve support from the underlying network drivers to implement true
   atomic sending. Thomas Gleixner talked [0] very briefly about how
   this could be implemented for netconsole during the 2022
   proof-of-concept presentation of the nbcon API.

   Cons: It most likely requires new API callbacks for the network
         drivers to implement hardware-specific solutions. Many (most?)
         drivers would not be able to support it.

   Pro: True reliable atomic printing via network.
That would make more sense, but deciding on it seems to be above my
pay grade. :-)

Thanks for helping us with this issue,
--breno