Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc
From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Date: 2026-03-04 12:20:14
Also in:
intel-wired-lan, linux-doc, linux-hardening, lkml
On Wed, 04 Mar 2026 12:07:45 +0200 Jani Nikula [off-list ref] wrote:

> On Mon, 23 Feb 2026, Jonathan Corbet [off-list ref] wrote:
> > Jani Nikula [off-list ref] writes:
> > > There's always the question, if you're putting a lot of effort into making kernel-doc closer to an actual C parser, why not put all that effort into using and adapting to, you know, an actual C parser?
> >
> > Not speaking to the current effort but ... in the past, when I have contemplated this (using, say, tree-sitter), the real problem is that those parsers simply strip out the comments. Kerneldoc without comments ... doesn't work very well. If there were a parser without those problems, and which could be made to do the right thing with all of our weird macro usage, it would certainly be worth considering.
>
> I think e.g. libclang and its Python bindings can be made to work. The main problems with that are passing proper compiler options (because it'll need to include stuff to know about types etc. because it is a proper parser), preprocessing everything is going to take time, you need to invest a bunch into it to know how slow exactly compared to the current thing and whether it's prohibitive, and it introduces an extra dependency.
It is not just that. Assume we're parsing something like this:

	static __always_inline int _raw_read_trylock(rwlock_t *lock) __cond_acquires_shared(true, lock);

using a cpp (or libclang). We would need to define/undefine 3 symbols:

	#if defined(WARN_CONTEXT_ANALYSIS) && !defined(__CHECKER__) && !defined(__GENKSYMS__)

(in this particular case, the default is OK, but on others, it may not be)

This is by far more complex than just writing a logic that would convert the above into:

	static int _raw_read_trylock(rwlock_t *lock);

which is the current kernel-doc approach.

- Using a C preprocessor, we might have a very big prototype - and even have arch-specific defines affecting it, as some includes may be inside arch/*/include.

So, we would need a kernel-doc ".config" file with a set of defines that can be hard to maintain.
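To make the conversion described above concrete, here is a rough Python sketch of stripping annotation macros from a prototype with regular expressions. It is only an illustration under stated assumptions, not kernel-doc's actual code: the names `STRIP_PATTERNS` and `simplify_prototype` are hypothetical, and the pattern list covers just the two annotations from the example prototype.

```python
import re

# Hypothetical table of annotations to drop. This is NOT kernel-doc's real
# list; it only covers the two macros from the example prototype.
STRIP_PATTERNS = [
    # Function-like annotation with a single level of parentheses,
    # e.g. __cond_acquires_shared(true, lock)
    re.compile(r"\b__cond_acquires_shared\s*\([^()]*\)"),
    # Bare attribute-style keyword
    re.compile(r"\b__always_inline\b"),
]


def simplify_prototype(proto: str) -> str:
    """Remove the known annotation macros and normalise whitespace."""
    for pattern in STRIP_PATTERNS:
        proto = pattern.sub(" ", proto)
    # Collapse the runs of whitespace left behind by the removals.
    proto = re.sub(r"\s+", " ", proto).strip()
    # Drop any space that now sits in front of the terminating ';'.
    proto = re.sub(r"\s+;", ";", proto)
    return proto


example = (
    "static __always_inline int _raw_read_trylock(rwlock_t *lock) "
    "__cond_acquires_shared(true, lock);"
)
print(simplify_prototype(example))
```

The point of the sketch is that no preprocessor symbols or include paths are involved: the input is treated as text, so the result does not depend on the build configuration. The cost is that every new annotation needs another entry in the table.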
> So yeah, there are definitely tradeoffs there. But it's not like this constant patching of kernel-doc is exactly burden free either. I don't know, is it just me, but I'd like to think as a profession we'd be past writing ad hoc C parsers by now.
I'd say that the binding logic and the ".config" kernel-doc defines will be complex to maintain. Maybe more complex than kernel-doc patching and a simple C parser, like the one in my test.
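As a rough idea of what a "simple C parser" front end could look like, below is a minimal regex-based tokenizer sketch in Python that keeps comments as tokens instead of discarding them. It is an assumption-laden illustration, not the test code referred to in this thread: `Token`, `TOKEN_SPEC` and `tokenize` are hypothetical names, and the token classes are deliberately coarse (no preprocessor handling, no distinction between keywords and identifiers).

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token classes. Order matters: comments and literals must be
# tried before the catch-all punctuation rule.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/|//[^\n]*"),
    ("STRING", r'"(?:\\.|[^"\\])*"'),
    ("CHAR", r"'(?:\\.|[^'\\])*'"),
    ("NUMBER", r"\d[\w.]*"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("PUNCT", r"->|\+\+|--|<<|>>|[<>=!]=|&&|\|\||\.\.\.|[^\w\s]"),
    ("SPACE", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC),
    re.DOTALL,  # let /* ... */ comments span several lines
)


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens in source order, keeping comments and skipping whitespace."""
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind != "SPACE":
            yield Token(kind, match.group())


# The comment text below is made up for the example.
source = "/* hypothetical comment */ static int f(rwlock_t *lock);"
tokens = list(tokenize(source))
for token in tokens:
    print(token.kind, token.text)
```

Because comments survive as ordinary tokens, a consumer can pair a `/** ... */` block with the declaration that follows it, which is the part that comment-stripping parsers make difficult.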
> > On Mon, 23 Feb 2026 15:47:00 +0200 Jani Nikula [off-list ref] wrote:
> > > There's always the question, if you're putting a lot of effort into making kernel-doc closer to an actual C parser, why not put all that effort into using and adapting to, you know, an actual C parser?
> >
> > Playing with this idea, it is not that hard to write an actual C parser - or at least a tokenizer.
>
> Just for the record, I suggested using an existing parser, not going all NIH and writing your own.
I know, but I suspect that a simple tokenizer similar to my example might do the job without any major impact, but yeah, tests are needed.

--
Thanks,
Mauro