Re: [PATCH net-next 0/2] net: netconsole: convert to NBCON console infrastructure
From: John Ogness <john.ogness@linutronix.de>
Date: 2026-01-09 15:13:33
Also in:
lkml
On 2026-01-09, Petr Mladek [off-list ref] wrote:
If we were adding this info, I would add it for all users, definitely. We are going to touch the internal printk ringbuffer format anyway.
Which could mean we are going to need a round of updating crash/dump tools. So we should choose wisely.
quoted hunk ↗ jump to hunk
Now, the main question is how far we want to go. I see the following possibilities: A) caller_id -> pid + cpu + atomic/task context bit =================================================== Replace the current .caller_id and always save both PID and CPU number. Something like:diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h index 4ef81349d9fb..3c7635ada6dd 100644 --- a/kernel/printk/printk_ringbuffer.h +++ b/kernel/printk/printk_ringbuffer.h@@ -9,6 +9,12 @@ #include <linux/stddef.h> #include <linux/types.h> + +struct printk_caller { + pid_t pid; /* caller pid */ + int cpu; /* caller CPU number */ +}; + /* * Meta information about each stored message. *@@ -22,8 +29,8 @@ struct printk_info { u8 facility; /* syslog facility */ u8 flags:5; /* internal record flags */ u8 level:3; /* syslog level */ - u32 caller_id; /* thread id or processor id */ + struct printk_caller caller; struct dev_printk_info dev_info; };Plus the task/interrupt bit might be stored either as the highest bit in .pid or by adding LOG_CALLER_IN_TASK bit into "enum printk_info_flags".
The thing I find attractive about this solution is that it could be done such that crash/dump tools must not be changed. We could leave the semantics for @caller_id as is and simply add @cpu:
--- a/kernel/printk/printk_ringbuffer.h
+++ b/kernel/printk/printk_ringbuffer.h@@ -23,6 +23,7 @@ struct printk_info { u8 flags:5; /* internal record flags */ u8 level:3; /* syslog level */ u32 caller_id; /* thread id or processor id */ + u32 cpu; /* processor id */ struct dev_printk_info dev_info; };
After all, if the caller is not in_task() then the new @pid would be meaningless anyway. If we are willing to accept printer-resolution of task names, then this simple addition would be good enough for netconsole, while not requiring any crash/dump tool updates. This would buy us time to think more seriously about a significant overhaul.
quoted hunk ↗ jump to hunk
B) caller_id -> pid + cpu + contex ================================== Same as above but the caller context info is stored separately and might allow to distinguish more types. Something like:diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h index 4ef81349d9fb..44aa60a3f84c 100644 --- a/kernel/printk/printk_ringbuffer.h +++ b/kernel/printk/printk_ringbuffer.h@@ -9,6 +9,19 @@ #include <linux/stddef.h> #include <linux/types.h> +#define PRINTK_CALLER_CTXT_PREEMPT_BIT 1 +#define PRINTK_CALLER_CTXT_SOFTIRQ_BIT 2 +#define PRINTK_CALLER_CTXT_HARDIRQ_BIT 3 +#define PRINTK_CALLER_CTXT_NMI 4 +/* This is not supported on all platforms, see irqs_disabled() */ +#define PRINTK_CALLER_CTXT_IRQS_DISABLED 5 + +struct printk_caller { + u8 ctxt; /* caller context: task, irq, ... */ + pid_t pid; /* caller pid */ + int cpu; /* caller CPU number */ +}; + /* * Meta information about each stored message. *@@ -22,8 +35,8 @@ struct printk_info { u8 facility; /* syslog facility */ u8 flags:5; /* internal record flags */ u8 level:3; /* syslog level */ - u32 caller_id; /* thread id or processor id */ + struct printk_caller caller; struct dev_printk_info dev_info; };
Just as with A, here we could also preserve @caller_id semantics (instead of introducing @pid) to avoid crash/dump tool updates. A new @ctxt field is only useful if it can be seen. I am not sure how you plan on showing this. By extending the prefix like caller_id does? Would it be a new kernel config or just use CONFIG_PRINTK_CALLER? Either way, we are talking about extending visible log data, which is something that needs a discussion on its own.
C) caller_id -> pid + cpu + comm + atomic/task context bit ========================================================== Similar to A and B but add also char comm[TASK_COMM_LEN]; into struct printk_caller.
Just as with A and B, we could preserve @caller_id semantics. This is the simplest solution since the printer would not need to perform any resolution at all and it would be 100% accurate (when in_task).
quoted hunk ↗ jump to hunk
D) caller_id -> pid + cpu + comm + atomic/task context bit and move printk_caller + printk_dev_info into text buffer as suggested as mentioned at https://lore.kernel.org/lkml/84y10vz7ty.fsf@jogness.linutronix.de (local) ==================================================================== There are many possibilities how to do it. And it involves two problems: a) how to store the data into data buffer b) how to integrate this in the ring buffer API I thought about several approaches and realized that it still would make sense to keep: + binary data (pid, cpu, ctxt) in desc ring + text data (comm, subsystem, device) in text_data buffer Something like:diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h index 4ef81349d9fb..41232bf2919d 100644 --- a/kernel/printk/printk_ringbuffer.h +++ b/kernel/printk/printk_ringbuffer.h@@ -9,6 +9,27 @@ #include <linux/stddef.h> #include <linux/types.h> +#define PRINTK_CALLER_CTXT_PREEMPT_BIT 0 +#define PRINTK_CALLER_CTXT_SOFTIRQ_BIT 1 +#define PRINTK_CALLER_CTXT_HARDIRQ_BIT 2 +#define PRINTK_CALLER_CTXT_NMI 3 +/* This is not supported on all platforms, see irqs_disabled() */ +#define PRINTK_CALLER_CTXT_IRQS_DISABLED 4 + +struct printk_caller { + u8 ctxt; /* caller context: task, irq, ... */ + pid_t pid; /* caller pid */ + int cpu; /* caller CPU number */ +}; + +/* + * Describe which text information about the caller + * is stored in text_data_ring + */ +#define PRINTK_CALLER_BITS_COMM_BIT 0 +#define PRINTK_CALLER_BITS_SUBSYSTEM_BIT 1 +#define PRINTK_CALLER_BITS_DEVICE_BIT 2 + /* * Meta information about each stored message. *@@ -22,9 +43,8 @@ struct printk_info { u8 facility; /* syslog facility */ u8 flags:5; /* internal record flags */ u8 level:3; /* syslog level */ - u32 caller_id; /* thread id or processor id */ - - struct dev_printk_info dev_info; + u8 caller_bits_size; /* size of the caller info in data buffer */ + u8 caller_bits; /* bits describing caller in data buffer */ }; /*The text would be stored in the test_data_ring in the following format: [<comm>\0][<subsystem>\0][<device\0]<text>\0
I would prefer: <text>\0[<subsystem>\0][<device>\0][<comm>\0] Crash/dump tools that are not updated would at least continue to print the text at the beginning of the line. Here are a few projects that I track and how they would react (unmodified): makedumpfile - https://github.com/makedumpfile/makedumpfile/blob/master/printk.c#L134 - will print "\x00" between the fields vmcore-dmesg - https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/util_lib/elf_info.c?h=main#n904 - will print "\x00" between the fields crash - https://github.com/crash-utility/crash/blob/master/printk.c#L210 - will print "." between the fields
aka, strings separated by '\0'. The information about the caller
would be optional where:
+ .caller_bits_size would contain the size of the optional
pieces, including the trailing '\0's.I do not know if we would need this. @text_len could include the caller area as well.
+ .caller_bits number would define which particular pieces
are there. It would be even somehow extensible. The crash
tool could ignore parts which are not supported.Yes, this works well.
My opinion: =========== I personally think that the approach B) is the best compromise for now because: 1. We really should distinguish task/interrupt context. IMHO, it is a very useful information provided by the caller_id now. The approach A) stores the context a hacky way. The approach B) is cleaner and provides even more precise info. 2. The mapping between PID and COMM might get lost if we do not store it. But it should not be problem most of the time because we try to flush consoles ASAP. I would keep it simple for now. We could add it later when it becomes a problem in practice. BTW: I think that we could detect when the mapping is valid by comparing task->start_time with current time... Devil advocate: Adding comm[TASK_COMM_LEN] into struct printk_info might be acceptable. It is "just" 16 bytes in compare with 64 bytes for dev_printk.
I am OK with A, B, or C if we can keep the @caller_id semantics. This would not require any changes to the LOG_CONT implementation or any other exists buffer preparation code and it would not require and changes do crash/dump tools. I.e. it would simply be making new fields available for netconsole. I can accept that we want to avoid C for now until we solve the efficient space issue. (My official preference list is below...)
3. It would be nice to optimize the memory usage and store the optional and variable strings (comm, subsystem, device) into data buffer. But it looks like a non-trivial work. I would do this only when there is a real demand. And I haven't heard about that the current approach was not acceptable yet.
Well, the whole reason the topic came up is because of a complaint from Geert. And I think if people knew just how much space was being wasted, they might speak up. For example, on my Debian distro kernel, I have a 256KB text_data_ring and an 896KB desc_ring. I expect that the total usage would cut in half if we packed the dev_printk_info data into the text_data_ring. I can prototype some tests (just to satisfy my curiousity). I do not think it would be much work. Whether the strings are copied from text_data arrays or dev_printk_info structs does not make much difference. There is only a slightly larger overhead to identify their existance and offsets. (Currently it is a single memcpy() of dev_printk_info.) But we would need to update the crash/dump tools. So I am not in a rush to implement D right now. My ordered preferences for right now would be: 1. keeping @caller_id semantics + adding @cpu + adding @comm (similar to your C) 2. keeping @caller_id semantics + adding @cpu (similar to your A) These additions would be purely to support netconsole and the additions would be under CONFIG_NETCONSOLE_DYNAMIC. No crash/dump tools should be changed due to this. For the long term I would like to move all strings into the text_data_ring. We could use that opportunity to add context bits, time stamps, etc. and then appropriately update the major crash/dump tools. John