Re: [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules
From: Masahiro Yamada <masahiroy@kernel.org>
Date: 2022-02-10 01:26:10
Also in:
linux-modules, lkml
On Wed, Feb 9, 2022 at 3:44 AM Nick Alcock [off-list ref] wrote:
> The mapping consists of three new symbols, computed by integrating the
> information in the (just-added) .tmp_vmlinux.ranges and
> modules_thick.builtin: taken together, they map address ranges
> (corresponding to object files on the input) to the names of zero or
> more modules containing those address ranges.
>
>  - kallsyms_module_addresses/kallsyms_module_offsets encodes the
>    address/offset of each object file (derived from the linker map), in
>    exactly the same way as kallsyms_addresses/kallsyms_offsets does
>    for symbols.  There is no size: instead, the object files are
>    assumed to tile the address space.  (This is slightly more
>    space-efficient than using a size).  Non-text-section addresses are
>    skipped: for now, all the users of this interface only need
>    module/non-module information for instruction pointer addresses, not
>    absolute-addressed symbols and the like.  This restriction can
>    easily be lifted in future.  (Regarding the name: right now the
>    entries correspond pretty closely to object files, so we could call
>    the section kallsyms_objfiles or something, but the optimizer added
>    in the next commit will change this.)
>
>  - kallsyms_module_names encodes the name of each module in a modified
>    form of strtab: notably, if an object file appears in *multiple*
>    modules, all of which are built in, this is encoded via a zero byte,
>    a one-byte module count, then a series of that many null-terminated
>    strings.  As a special case, the table starts with a single zero
>    byte which does *not* represent the start of a multi-module list.
>
>  - kallsyms_modules connects the two, encoding a table associated 1:1
>    with kallsyms_module_addresses / kallsyms_module_offsets, pointing
>    at an offset in kallsyms_module_names describing which module (or
>    modules, for a multi-module list) the code occupying this address
>    range is part of.  If an address range is part of no module (always
>    built-in) it points at 0 (the null byte at the start of the
>    kallsyms_module_names list).
>
> There is no optimization yet: kallsyms_modules and
> kallsyms_module_names will almost certainly contain many duplicate
> entries, and kallsyms_module_{addresses,offsets} may contain
> consecutive entries that point to the same place.  The size hit is
> fairly substantial as a result, though still much less than a naive
> implementation mapping each symbol to a module name would be: 50KiB or
> so.
>
> Signed-off-by: Nick Alcock <redacted>
> Reviewed-by: Kris Van Hees <redacted>
> ---
>  Makefile           |   2 +-
>  init/Kconfig       |   8 +
>  scripts/Makefile   |   6 +
>  scripts/kallsyms.c | 366 +++++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 371 insertions(+), 11 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 5e823fe8390f..b719244cb571 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1151,7 +1151,7 @@ cmd_link-vmlinux = \
>  	$(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)"; \
>  	$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
>
> -vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
> +vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) modules_thick.builtin FORCE
>  	+$(call if_changed_dep,link-vmlinux)
>
>  targets := vmlinux
> diff --git a/init/Kconfig b/init/Kconfig
> index e9119bf54b1f..e1ca3d70cb1c 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1530,6 +1530,14 @@ config POSIX_TIMERS
>
>  	  If unsure say y.
>
> +config KALLMODSYMS
> +	default y
> +	bool "Enable support for /proc/kallmodsyms" if EXPERT
> +	depends on KALLSYMS
> +	help
> +	  This option enables the /proc/kallmodsyms file, which maps symbols
> +	  to addresses and their associated modules.
> +
>  config PRINTK
>  	default y
>  	bool "Enable support for printk" if EXPERT
> diff --git a/scripts/Makefile b/scripts/Makefile
> index ce5aa9030b74..c5cc4ac3d660 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -29,6 +29,12 @@ ifdef CONFIG_BUILDTIME_MCOUNT_SORT
>  HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED
>  endif
>
> +kallsyms-objs := kallsyms.o
> +
> +ifdef CONFIG_KALLMODSYMS
> +kallsyms-objs += modules_thick.o
> +endif
> +
>  # The following programs are only built on demand
>
>  hostprogs += unifdef
> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index 54ad86d13784..8f87b724d0fa 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -5,7 +5,10 @@
>   * This software may be used and distributed according to the terms
>   * of the GNU General Public License, incorporated herein by reference.
>   *
> - * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
> + * Usage: nm -n vmlinux
> + *        | scripts/kallsyms [--all-symbols] [--absolute-percpu]
> + *             [--base-relative] [--builtin=modules_thick.builtin]
> + *        > symbols.S
>   *
>   * Table compression uses all the unused char codes on the symbols and
>   * maps these to the most used substrings (tokens). For instance, it might
> @@ -24,6 +27,10 @@
>  #include <string.h>
>  #include <ctype.h>
>  #include <limits.h>
> +#include <assert.h>
> +#include "modules_thick.h"
> +
> +#include "../include/generated/autoconf.h"

I do not remember if I had pointed this out before,
but including autoconf.h from a host program is wrong.

Do not use ifdef CONFIG_... in the hostprog code.
Having --builtin=modules_thick.builtin is enough.

--
Best Regards
Masahiro Yamada
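
To illustrate the point about keeping CONFIG_ tests out of hostprog code, here is a rough sketch of how a host tool could decide at run time whether to emit the module tables, based only on whether --builtin= was passed. It is not taken from the patch: struct host_opts, parse_host_opts() and want_module_tables() are hypothetical names, and the sketch assumes the build scripts pass --builtin= only when the feature is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option state for a host tool; names are illustrative only. */
struct host_opts {
	bool all_symbols;
	const char *builtin_path;	/* NULL: --builtin= was not given */
};

/*
 * Parse argv without consulting any CONFIG_* macro.
 * Returns 0 on success, -1 on an unrecognized argument.
 */
static int parse_host_opts(int argc, char **argv, struct host_opts *opts)
{
	static const char builtin_prefix[] = "--builtin=";
	const size_t prefix_len = sizeof(builtin_prefix) - 1;
	int i;

	assert(opts != NULL);
	opts->all_symbols = false;
	opts->builtin_path = NULL;

	for (i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--all-symbols") == 0)
			opts->all_symbols = true;
		else if (strncmp(argv[i], builtin_prefix, prefix_len) == 0)
			opts->builtin_path = argv[i] + prefix_len;
		else
			return -1;
	}
	return 0;
}

/* Run-time stand-in for an "#ifdef CONFIG_KALLMODSYMS" inside the tool. */
static bool want_module_tables(const struct host_opts *opts)
{
	return opts->builtin_path != NULL && opts->builtin_path[0] != '\0';
}
```

With something along these lines, the configuration check stays on the build-script side, and the host program needs neither autoconf.h nor a conditional object list to decide what to emit.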