Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc

[PATCH 00/38] docs: several improvements to kernel-doc · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 01/38] docs: kdoc_re: add support for groups() · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 02/38] docs: kdoc_re: don't go past the end of a line · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 03/38] docs: kdoc_parser: move var transformers to the beginning · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 04/38] docs: kdoc_parser: don't mangle with function defines · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 05/38] docs: kdoc_parser: add functions support for NestedMatch · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 06/38] docs: kdoc_parser: use NestedMatch to handle __attribute__ on functions · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 08/38] docs: kdoc_parser: fix the default_value logic for variables · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 07/38] docs: kdoc_parser: fix variable regexes to work with size_t · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 10/38] docs: kdoc_parser: don't exclude defaults from prototype · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 09/38] docs: kdoc_parser: add some debug for variable parsing · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 11/38] docs: kdoc_parser: fix parser to support multi-word types · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 13/38] docs: kdoc_parser: add support for LIST_HEAD · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 12/38] docs: kdoc_parser: ignore context analysis and lock attributes · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 15/38] docs: kdoc_re: properly handle strings and escape chars on it · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 14/38] docs: kdoc_parser: handle struct member macro VIRTIO_DECLARE_FEATURES(name) · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 16/38] docs: kdoc_re: better show KernRe() at documentation · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 18/38] docs: kdoc_re: Change NestedMath args replacement to \0 · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 17/38] docs: kdoc_re: don't recompile NestedMatch regex every time · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 21/38] docs: kdoc_parser: better handle struct_group macros · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 20/38] docs: kdoc_re: add support on NestedMatch for argument replacement · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 19/38] docs: kdoc_re: make NestedMatch use KernRe · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 22/38] docs: kdoc_re: fix a parse bug on struct page_pool_params · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 23/38] docs: kdoc_re: add a helper class to declare C function matches · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 25/38] docs: kdoc_parser: minimize differences with struct_group_tagged · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 24/38] docs: kdoc_parser: use the new CFunction class · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 27/38] docs: kdoc_re: don't remove the trailing ";" with NestedMatch · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 28/38] docs: kdoc_re: prevent adding whitespaces on sub replacements · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 29/38] docs: xforms_lists.py: use CFuntion to handle all function macros · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 26/38] docs: kdoc_parser: move transform lists to a separate file · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 30/38] docs: kdoc_files: allows the caller to use a different xforms class · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 31/38] docs: kdoc_re: Fix NestedMatch.sub() which causes PDF builds to break · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 33/38] docs: kdoc_output: add optional args to ManOutput class · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 32/38] docs: kdoc_files: document KernelFiles() ABI · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 34/38] docs: sphinx-build-wrapper: better handle troff .TH markups · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 36/38] docs: sphinx-build-wrapper: don't allow "/" on file names · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 35/38] docs: kdoc_output: use a more standard order for .TH on man pages · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 37/38] docs: kdoc_output: describe the class init parameters · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
[PATCH 38/38] docs: kdoc_output: pick a better default for modulename · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-18
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Randy Dunlap <hidden> · 2026-02-21
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Randy Dunlap <hidden> · 2026-02-22
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Jani Nikula <jani.nikula@linux.intel.com> · 2026-02-23
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Jonathan Corbet <corbet@lwn.net> · 2026-02-23
Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-02-24
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Jani Nikula <jani.nikula@linux.intel.com> · 2026-03-04
Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-03-04
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Jonathan Corbet <corbet@lwn.net> · 2026-03-04
Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-03-13
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-03-03
RE: [PATCH 00/38] docs: several improvements to kernel-doc · Loktionov, Aleksandr <hidden> · 2026-03-03
Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-03-03
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Jani Nikula <jani.nikula@linux.intel.com> · 2026-03-04
Re: [PATCH 00/38] docs: several improvements to kernel-doc · Jonathan Corbet <corbet@lwn.net> · 2026-02-23
Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc · Mauro Carvalho Chehab <mchehab+huawei@kernel.org> · 2026-03-02
Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc · Jonathan Corbet <corbet@lwn.net> · 2026-03-02

From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Date: 2026-03-13 10:48:49
Also in: intel-wired-lan, linux-doc, linux-hardening, lkml

On Wed, 04 Mar 2026 12:07:45 +0200
Jani Nikula [off-list ref] wrote:

On Mon, 23 Feb 2026, Jonathan Corbet [off-list ref] wrote:

Jani Nikula [off-list ref] writes:

There's always the question, if you're putting a lot of effort into
making kernel-doc closer to an actual C parser, why not put all that
effort into using and adapting to, you know, an actual C parser?

Not speaking to the current effort but ... in the past, when I have
contemplated this (using, say, tree-sitter), the real problem is that
those parsers simply strip out the comments.  Kerneldoc without comments
... doesn't work very well.  If there were a parser without those
problems, and which could be made to do the right thing with all of our
weird macro usage, it would certainly be worth considering.

I think e.g. libclang and its Python bindings can be made to work. The
main problems with that are passing proper compiler options (because
it'll need to include stuff to know about types etc. because it is a
proper parser), preprocessing everything is going to take time, you need
to invest a bunch into it to know how slow exactly compared to the
current thing and whether it's prohibitive, and it introduces an extra
dependency.

So yeah, there are definitely tradeoffs there. But it's not like this
constant patching of kernel-doc is exactly burden free either.

In my tests with a simple C tokenizer:

	https://lore.kernel.org/linux-doc/cover.1773326442.git.mchehab+huawei@kernel.org/

The tokenizer works fine and does not slow things down much: parsing
the entire kernel tree for man page generation goes from 37s to 47s.
It should not change the time for htmldocs much, as right now only
~4 seconds are needed to read and parse the files referenced by the
kernel-doc tags under Documentation.

The code can still be cleaned up, as there are still some things
hardcoded in the various dump_* functions that could be better
implemented (*).

The advantage of the approach I'm using is that it allows migrating
to the tokenized code gradually, as the conversion can be done
incrementally.

(*) for instance, __attribute__ and a couple of other macros are parsed
    twice in the dump_struct() logic, in different places.

I don't
know, is it just me, but I'd like to think as a profession we'd be past
writing ad hoc C parsers by now.

Probably not, but I don't think we need a C parser, as kernel-doc
just needs to understand data types (enum, struct, typedef, union,
vars) and function/macro prototypes.

For that purpose, a tokenizer sounds sufficient.
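
As an illustration only, here is a minimal sketch of what a
comment-preserving tokenizer can look like. It is not the code from
the series: the token classes and the tokenize() name are assumptions
made for this example.

```python
import re

# Hypothetical token classes; the real kernel-doc tokenizer may differ.
# Order matters: comments and strings must be tried before punctuation.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/|//[^\n]*"),
    ("STRING",  r'"(?:\\.|[^"\\])*"'),
    ("NAME",    r"[A-Za-z_]\w*"),
    ("NUMBER",  r"\d\w*"),
    ("SPACE",   r"\s+"),
    ("PUNCT",   r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Yield (kind, text) pairs, dropping whitespace but keeping comments."""
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup != "SPACE":
            yield match.lastgroup, match.group()
```

Unlike a full parser, this keeps the comments as ordinary tokens, so
the kernel-doc markup stays available next to the prototype it
describes.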

Now, there is the code that currently lives in:
	https://github.com/mchehab/linux/blob/tokenizer-v5/tools/lib/python/kdoc/xforms_lists.py

which contains a list of C/gcc/clang keywords that will
be ignored, like:

	__attribute__
	static
	extern
	inline

Together with a sanitized version of the kernel macros it needs
to handle or ignore:

	DECLARE_BITMAP
	DECLARE_HASHTABLE
	__acquires
	__init
	__exit
	struct_group
	...
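
To make the "ignore" case concrete, here is a rough sketch of dropping
such names from an already tokenized prototype. The two sets and the
strip_ignored() helper are hypothetical and much shorter than the real
lists; macros that need rewriting rather than dropping (such as
DECLARE_BITMAP or struct_group) are not covered.

```python
# Hypothetical ignore lists for illustration; the real ones live in
# xforms_lists.py and are considerably longer.
IGNORED_KEYWORDS = {"static", "extern", "inline", "__init", "__exit"}
IGNORED_CALLS = {"__attribute__", "__acquires"}  # dropped with their (...) args

def strip_ignored(tokens):
    """Return tokens without ignored keywords and function-like annotations."""
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in IGNORED_KEYWORDS:
            i += 1
        elif tok in IGNORED_CALLS:
            i += 1
            # Skip the balanced parenthesized argument list, if any.
            if i < len(tokens) and tokens[i] == "(":
                depth = 0
                while i < len(tokens):
                    if tokens[i] == "(":
                        depth += 1
                    elif tokens[i] == ")":
                        depth -= 1
                    i += 1
                    if depth == 0:
                        break
        else:
            out.append(tok)
            i += 1
    return out
```

Counting parentheses on tokens, instead of matching them with a regex,
is what lets nested arguments such as __attribute__((unused)) be
removed in one pass.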


Once we finish cleaning up kdoc_parser.py so that it relies only
on this file for prototype transformations, it will be the only file
that requires changes when more macros start affecting
kernel-doc.

As this is complex and may require manual adjustments, it
is probably better not to try to auto-generate the xforms list
at runtime. A better approach, IMO, is to have C preprocessor-based
code that helps update it periodically, e.g. via a target like:

	make kdoc-xforms

that would use either cpp or clang to generate a patch
updating the xforms_lists.py content after new macros that
affect docs generation are added.
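
One possible shape for such a helper, purely as a sketch: take a dump
of "#define" lines (for instance what the preprocessor prints with
-dM -E) and report the function-like macros that the xforms list does
not know yet. No such target or script exists in the series; the
new_macros() name and the sample input are made up for this example.

```python
import re

# Matches "#define NAME" and records whether "(" follows the name directly,
# which is what makes a macro function-like in C.
DEFINE_RE = re.compile(r"^#define\s+([A-Za-z_]\w*)(\()?", re.MULTILINE)

def new_macros(define_dump, known):
    """List function-like macros from a '#define' dump that are not yet known."""
    found = set()
    for name, paren in DEFINE_RE.findall(define_dump):
        if paren and name not in known:
            found.add(name)
    return sorted(found)

# Made-up sample input; the macro bodies are placeholders, not the
# real kernel definitions.
SAMPLE_DUMP = """\
#define __init placeholder
#define DECLARE_BITMAP(name,bits) placeholder
#define struct_group(NAME, ...) placeholder
#define FOO (1)
"""
```

The output would still need a human to decide whether each macro
should be ignored, rewritten, or left alone, which matches the point
above about manual adjustments.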

-- 
Thanks,
Mauro
