Thread: 15 messages, 4 authors, 2024-09-20

Re: [PATCH v10 2/4] kbuild: generate offset range data for builtin modules

From: Kris Van Hees <hidden>
Date: 2024-09-19 21:02:13
Also in: linux-kbuild, linux-modules, lkml
Subsystem: documentation, kernel build + files below scripts/ (unless maintained elsewhere), the rest · Maintainers: Jonathan Corbet, Nathan Chancellor, Nicolas Schier, Linus Torvalds

On Thu, Sep 19, 2024 at 11:28:44PM +0900, Masahiro Yamada wrote:
Hi Kris,



On Tue, Sep 10, 2024 at 4:43 AM Kris Van Hees [off-list ref] wrote:
On Sun, Sep 08, 2024 at 11:50:51AM +0900, Masahiro Yamada wrote:
On Fri, Sep 6, 2024 at 11:45 PM Kris Van Hees [off-list ref] wrote:
Create the file modules.builtin.ranges, which can be used to find where
built-in modules are located by their addresses. This will be useful for
tracing tools to determine which functions belong to which built-in modules.

The offset range data for builtin modules is generated using:
 - modules.builtin: associates object files with module names
 - vmlinux.map: provides load order of sections and offset of first member
    per section
 - vmlinux.o.map: provides offset of object file content per section
 - .*.cmd: build cmd file with KBUILD_MODFILE
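As a rough illustration of the last input, the sketch below pulls the module object names out of a build command string. The exact quoting of the option (-DKBUILD_MODFILE='"..."' with space-separated names), the helper name modfiles_from_cmd, and the sample command are all assumptions made for this example, not taken from the patch.

```python
import re

# Assumed shape of the option inside a .<obj>.cmd file. The quoting and the
# space-separated list are assumptions made for this sketch.
MODFILE_RE = re.compile("-DKBUILD_MODFILE='\"([^\"]*)\"'")


def modfiles_from_cmd(cmd_text):
    """Return the module object names listed in KBUILD_MODFILE, if any."""
    match = MODFILE_RE.search(cmd_text)
    return match.group(1).split() if match else []


# Hypothetical command line, shortened for illustration.
SAMPLE_CMD = (
    "gcc -c -DKBUILD_MODFILE='\"drivers/demo/mod_a drivers/demo/mod_b\"' "
    "-o demo.o demo.c"
)
```

An object whose command has no KBUILD_MODFILE option yields an empty list, which matches the case of code that does not belong to any built-in module.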

The generated data will look like:

.text 00000000-00000000 = _text
.text 0000baf0-0000cb10 amd_uncore
.text 0009bd10-0009c8e0 iosf_mbi
...
.text 00b9f080-00ba011a intel_skl_int3472_discrete
.text 00ba0120-00ba03c0 intel_skl_int3472_discrete intel_skl_int3472_tps68470
.text 00ba03c0-00ba08d6 intel_skl_int3472_tps68470
...
.data 00000000-00000000 = _sdata
.data 0000f020-0000f680 amd_uncore

For each ELF section, it lists the offset of the first symbol.  This can
be used to determine the base address of the section at runtime.
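A minimal sketch of that calculation, assuming the runtime address of the anchor symbol is already known to the tool (the address below is made up, and section_base is a hypothetical helper name):

```python
def section_base(anchor_runtime_addr, anchor_offset):
    """Base address of a section: symbol address minus symbol offset."""
    return anchor_runtime_addr - anchor_offset


# From the sample line ".text 00000000-00000000 = _text".
TEXT_ANCHOR_OFFSET = 0x00000000
# Made-up runtime address of _text, for illustration only.
TEXT_ANCHOR_ADDR = 0xFFFFFFFF9A000000

text_base = section_base(TEXT_ANCHOR_ADDR, TEXT_ANCHOR_OFFSET)
# Runtime start of the ".text 0000baf0-0000cb10 amd_uncore" range.
amd_uncore_start = text_base + 0x0000BAF0
```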

Next, it lists (in strict ascending order) offset ranges in that section
that cover the symbols of one or more builtin modules.  Multiple ranges
can apply to a single module, and ranges can be shared between modules.
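The format described above could be read along the following lines. This is only a sketch: parse_ranges and modules_at are hypothetical names, and treating each range as half-open (start inclusive, end exclusive) is an assumption based on adjacent ranges in the sample sharing an endpoint.

```python
SAMPLE = """\
.text 00000000-00000000 = _text
.text 0000baf0-0000cb10 amd_uncore
.text 00ba0120-00ba03c0 intel_skl_int3472_discrete intel_skl_int3472_tps68470
.text 00ba03c0-00ba08d6 intel_skl_int3472_tps68470
.data 00000000-00000000 = _sdata
.data 0000f020-0000f680 amd_uncore
"""


def parse_ranges(text):
    """Split the file into per-section anchors and module offset ranges."""
    anchors = {}  # section -> (offset, symbol)
    ranges = {}   # section -> list of (start, end, modules)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or "-" not in fields[1]:
            continue  # blank or unrecognized line
        section = fields[0]
        start_s, end_s = fields[1].split("-", 1)
        start, end = int(start_s, 16), int(end_s, 16)
        if fields[2] == "=":
            if len(fields) >= 4:
                anchors[section] = (start, fields[3])
        else:
            ranges.setdefault(section, []).append(
                (start, end, tuple(fields[2:]))
            )
    return anchors, ranges


def modules_at(ranges, section, offset):
    """Return the modules whose range covers offset (assumed half-open)."""
    for start, end, modules in ranges.get(section, []):
        if start <= offset < end:
            return modules
    return ()


anchors, ranges = parse_ranges(SAMPLE)
```

Because the ranges are listed in strict ascending order, a binary search would also work for large files; the linear scan keeps the sketch short.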

The CONFIG_BUILTIN_MODULE_RANGES option controls whether offset range data
is generated for kernel modules that are built into the kernel image.

How it works:

 1. The modules.builtin file is parsed to obtain a list of built-in
    module names and their associated object names (the .ko file that
    the module would be in if it were a loadable module, hereafter
    referred to as <kmodfile>).  This object name can be used to
    identify objects in the kernel compile because any C or assembler
    code that ends up in a built-in module will have the option
    -DKBUILD_MODFILE=<kmodfile> present in its build command, and those
    can be found in the .<obj>.cmd file in the kernel build tree.

    If an object is part of multiple modules, they will all be listed
    in the KBUILD_MODFILE option argument.

    This allows us to conclusively determine whether an object in the
    kernel build belongs to any modules, and which.

 2. The vmlinux.map is parsed next to determine the base address of each
    top level section so that all addresses into the section can be
    turned into offsets.  This makes it possible to handle sections
    getting loaded at different addresses at system boot.

    We also determine an 'anchor' symbol at the beginning of each
    section to make it possible to calculate the true base address of
    a section at runtime (i.e. symbol address - symbol offset).

    We collect start addresses of sections that are included in the top
    level section.  This is used when vmlinux is linked using vmlinux.o,
    because in that case, we need to look at the vmlinux.o linker map to
    know what object a symbol is found in.

    And finally, we process each symbol that is listed in vmlinux.map
    (or vmlinux.o.map) based on the following structure:

    vmlinux linked from vmlinux.a:

      vmlinux.map:
        <top level section>
          <included section>  -- (might be same as top level section)
            <object>          -- built-in association known
              <symbol>        -- belongs to module(s) object belongs to
              ...

    vmlinux linked from vmlinux.o:

      vmlinux.map:
        <top level section>
          <included section>  -- (might be same as top level section)
            vmlinux.o         -- need to use vmlinux.o.map
              <symbol>        -- ignored
              ...

      vmlinux.o.map:
        <section>
            <object>          -- built-in association known
              <symbol>        -- belongs to module(s) object belongs to
              ...

 3. As sections, objects, and symbols are processed, offset ranges are
    constructed in a straightforward way:

      - If the symbol belongs to one or more built-in modules:
          - If we were working on the same module(s), extend the range
            to include this object
          - If we were working on another module(s), close that range,
            and start the new one
      - If the symbol does not belong to any built-in modules:
          - If we were working on a module(s) range, close that range
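The range construction in step 3 can be sketched as a single pass over symbols in ascending offset order. This is an illustration only, not the script from the patch: build_ranges is a hypothetical name, and representing each symbol as a (start, end, modules) tuple is an assumption made for the example.

```python
def build_ranges(symbols):
    """Merge consecutive symbols with the same module set into ranges.

    symbols: iterable of (start, end, modules) in ascending offset order,
    where modules is empty for code outside any built-in module.
    """
    ranges = []
    current = None  # [start, end, modules] of the open range, if any
    for start, end, modules in symbols:
        mods = tuple(sorted(modules))
        if mods and current is not None and current[2] == mods:
            current[1] = end  # same module(s): extend the open range
            continue
        if current is not None:
            ranges.append(tuple(current))  # close the previous range
            current = None
        if mods:
            current = [start, end, mods]  # start a new range
    if current is not None:
        ranges.append(tuple(current))
    return ranges


# Hypothetical symbol list: "a" and "b" stand in for module names.
SYMBOLS = [
    (0x00, 0x10, ()),
    (0x10, 0x20, ("a",)),
    (0x20, 0x30, ("a",)),
    (0x30, 0x40, ("a", "b")),
    (0x40, 0x50, ()),
    (0x50, 0x60, ("b",)),
]
built = build_ranges(SYMBOLS)
```

A symbol shared by two modules opens its own range, which is how a line listing more than one module name would arise in the generated data.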

Signed-off-by: Kris Van Hees <redacted>
Reviewed-by: Nick Alcock <redacted>
Reviewed-by: Alan Maguire <redacted>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Sam James <redacted>
---

If v10 is the final version, I offer to locally squash the following:
Thanks!  That would be great!  v10 is indeed the final version (see below).
diff --git a/.gitignore b/.gitignore
index c06a3ef6d6c6..625bf59ad845 100644
--- a/.gitignore
+++ b/.gitignore
@@ -69,6 +69,7 @@ modules.order
 /Module.markers
 /modules.builtin
 /modules.builtin.modinfo
+/modules.builtin.ranges
 /modules.nsdeps

 #
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 3c399f132e2d..a867aea95c40 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -180,6 +180,7 @@ modpost
 modules-only.symvers
 modules.builtin
 modules.builtin.modinfo
+modules.builtin.ranges
 modules.nsdeps
 modules.order
 modversions.h*
If Sami reports more errors and you end up with v11,
please remember to fold it.
Sami confirmed v10 [0].  Can you squash his reviewed-by and tested-by as well?

Thanks for all the help!

        Kris

[0] https://lore.kernel.org/lkml/20240909191801.GA398180@google.com/




Can you please add a small explanation to
Documentation/kbuild/kbuild.rst ?


It documents modules.order, modules.builtin, modules.builtin.modinfo.

Having modules.builtin.ranges there will keep the consistency.



You do not need to re-submit the entire patch.

If you provide a diff in a few days,
I will locally squash it.
Thank you for offering to locally squash the diff.

	Kris

diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
index 9c8d1d046ea5..142be0c74761 100644
--- a/Documentation/kbuild/kbuild.rst
+++ b/Documentation/kbuild/kbuild.rst
@@ -22,6 +22,11 @@ modules.builtin.modinfo
 This file contains modinfo from all modules that are built into the kernel.
 Unlike modinfo of a separate module, all fields are prefixed with module name.
 
+modules.builtin.ranges
+----------------------
+This file contains address offset ranges (per ELF section) for all modules
+that are built into the kernel.  Together with System.map, it can be used
+to associate module names with symbols.
 
 Environment variables
 =====================