Thread (10 messages) 10 messages, 3 authors, 2022-02-11

Re: [PATCH v8 4/6] kallsyms: introduce sections needed to map symbols to built-in modules

From: Masahiro Yamada <masahiroy@kernel.org>
Date: 2022-02-10 01:26:10
Also in: linux-modules, lkml

On Wed, Feb 9, 2022 at 3:44 AM Nick Alcock [off-list ref] wrote:
quoted hunk ↗ jump to hunk
The mapping consists of three new symbols, computed by integrating the
information in the (just-added) .tmp_vmlinux.ranges and
modules_thick.builtin: taken together, they map address ranges
(corresponding to object files on the input) to the names of zero or
more modules containing those address ranges.

 - kallsyms_module_addresses/kallsyms_module_offsets encodes the
   address/offset of each object file (derived from the linker map), in
   exactly the same way as kallsyms_addresses/kallsyms_offsets does
   for symbols.  There is no size: instead, the object files are assumed
   to tile the address space.  (This is slightly more space-efficient
   than using a size).  Non-text-section addresses are skipped: for now,
   all the users of this interface only need module/non-module
   information for instruction pointer addresses, not absolute-addressed
   symbols and the like.  This restriction can easily be lifted in
   future.  (Regarding the name: right now the entries correspond pretty
   closely to object files, so we could call the section
   kallsyms_objfiles or something, but the optimizer added in the next
   commit will change this.)

 - kallsyms_module_names encodes the name of each module in a modified
   form of strtab: notably, if an object file appears in *multiple*
   modules, all of which are built in, this is encoded via a zero byte,
   a one-byte module count, then a series of that many null-terminated
   strings.  As a special case, the table starts with a single zero byte
   which does *not* represent the start of a multi-module list.

 - kallsyms_modules connects the two, encoding a table associated 1:1
   with kallsyms_module_addresses / kallsyms_module_offsets, pointing
   at an offset in kallsyms_module_names describing which module (or
   modules, for a multi-module list) the code occupying this address
   range is part of.  If an address range is part of no module (always
   built-in) it points at 0 (the null byte at the start of the
   kallsyms_module_names list).

There is no optimization yet: kallsyms_modules and
kallsyms_module_names will almost certainly contain many duplicate
entries, and kallsyms_module_{addresses,offsets} may contain
consecutive entries that point to the same place.  The size hit is
fairly substantial as a result, though still much less than a naive
implementation mapping each symbol to a module name would be: 50KiB or
so.

Signed-off-by: Nick Alcock <redacted>
Reviewed-by: Kris Van Hees <redacted>
---
 Makefile           |   2 +-
 init/Kconfig       |   8 +
 scripts/Makefile   |   6 +
 scripts/kallsyms.c | 366 +++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 371 insertions(+), 11 deletions(-)
diff --git a/Makefile b/Makefile
index 5e823fe8390f..b719244cb571 100644
--- a/Makefile
+++ b/Makefile
@@ -1151,7 +1151,7 @@ cmd_link-vmlinux =                                                 \
        $(CONFIG_SHELL) $< "$(LD)" "$(KBUILD_LDFLAGS)" "$(LDFLAGS_vmlinux)";    \
        $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)

-vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) FORCE
+vmlinux: scripts/link-vmlinux.sh autoksyms_recursive $(vmlinux-deps) modules_thick.builtin FORCE
        +$(call if_changed_dep,link-vmlinux)

 targets := vmlinux
diff --git a/init/Kconfig b/init/Kconfig
index e9119bf54b1f..e1ca3d70cb1c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1530,6 +1530,14 @@ config POSIX_TIMERS

          If unsure say y.

+config KALLMODSYMS
+       default y
+       bool "Enable support for /proc/kallmodsyms" if EXPERT
+       depends on KALLSYMS
+       help
+         This option enables the /proc/kallmodsyms file, which maps symbols
+         to addresses and their associated modules.
+
 config PRINTK
        default y
        bool "Enable support for printk" if EXPERT
diff --git a/scripts/Makefile b/scripts/Makefile
index ce5aa9030b74..c5cc4ac3d660 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -29,6 +29,12 @@ ifdef CONFIG_BUILDTIME_MCOUNT_SORT
 HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED
 endif

+kallsyms-objs  := kallsyms.o
+
+ifdef CONFIG_KALLMODSYMS
+kallsyms-objs += modules_thick.o
+endif
+
 # The following programs are only built on demand
 hostprogs += unifdef
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 54ad86d13784..8f87b724d0fa 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -5,7 +5,10 @@
  * This software may be used and distributed according to the terms
  * of the GNU General Public License, incorporated herein by reference.
  *
- * Usage: nm -n vmlinux | scripts/kallsyms [--all-symbols] > symbols.S
+ * Usage: nm -n vmlinux
+ *        | scripts/kallsyms [--all-symbols] [--absolute-percpu]
+ *             [--base-relative] [--builtin=modules_thick.builtin]
+ *        > symbols.S
  *
  *      Table compression uses all the unused char codes on the symbols and
  *  maps these to the most used substrings (tokens). For instance, it might
@@ -24,6 +27,10 @@
 #include <string.h>
 #include <ctype.h>
 #include <limits.h>
+#include <assert.h>
+#include "modules_thick.h"
+
+#include "../include/generated/autoconf.h"


I do not remember if I had pointed this out before,
but including autoconf.h from a host program is wrong.

Do not use ifdef CONFIG_...  in the hostprog code.
Having --builtin=modules_thick.builtin is enough.





-- 
Best Regards
Masahiro Yamada
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help