Thread (9 messages) 9 messages, 2 authors, 2024-08-20

Re: [PATCH v6 3/4] scripts: add verifier script for builtin module range data

From: Masahiro Yamada <masahiroy@kernel.org>
Date: 2024-08-18 06:41:14
Also in: linux-kbuild, linux-modules, lkml

On Fri, Aug 16, 2024 at 12:04 AM Kris Van Hees [off-list ref] wrote:
quoted hunk ↗ jump to hunk
The modules.builtin.ranges offset range data for builtin modules is
generated at compile time based on the list of built-in modules and
the vmlinux.map and vmlinux.o.map linker maps.  This data can be used
to determine whether a symbol at a particular address belongs to
module code that was configured to be compiled into the kernel proper
as a built-in module (rather than as a standalone module).

This patch adds a script that uses the generated modules.builtin.ranges
data to annotate the symbols in the System.map with module names if
their address falls within a range that belongs to one or more built-in
modules.

It then processes the vmlinux.map (and if needed, vmlinux.o.map) to
verify the annotation:

  - For each top-level section:
     - For each object in the section:
        - Determine whether the object is part of a built-in module
          (using modules.builtin and the .*.cmd file used to compile
           the object as suggested in [0])
        - For each symbol in that object, verify that the built-in
          module association (or lack thereof) matches the annotation
          given to the symbol.

Signed-off-by: Kris Van Hees <redacted>
Reviewed-by: Nick Alcock <redacted>
Reviewed-by: Alan Maguire <redacted>
---
    Changes since v5:
     - Added optional 6th argument to specify kernel build directory.
     - Report error and exit if .*.o.cmd files cannot be read.

    Changes since v4:
     - New patch in the series
---
 scripts/verify_builtin_ranges.awk | 365 ++++++++++++++++++++++++++++++
 1 file changed, 365 insertions(+)
 create mode 100755 scripts/verify_builtin_ranges.awk
diff --git a/scripts/verify_builtin_ranges.awk b/scripts/verify_builtin_ranges.awk
new file mode 100755
index 000000000000..b82cf0a0fbeb
--- /dev/null
+++ b/scripts/verify_builtin_ranges.awk
@@ -0,0 +1,365 @@
+#!/usr/bin/gawk -f
+# SPDX-License-Identifier: GPL-2.0
+# verify_builtin_ranges.awk: Verify address range data for builtin modules
+# Written by Kris Van Hees <kris.van.hees@oracle.com>
+#
+# Usage: verify_builtin_ranges.awk modules.builtin.ranges System.map \
+#                                 modules.builtin vmlinux.map vmlinux.o.map \
+#                                 [ <build-dir> ]
+#
+
+# Return the module name(s) (if any) associated with the given object.
+#
+# If we have seen this object before, return information from the cache.
+# Otherwise, retrieve it from the corresponding .cmd file.
+#
+function get_module_info(fn, mod, obj, mfn, s) {
+       if (fn in omod)
+               return omod[fn];
+
+       if (match(fn, /\/[^/]+$/) == 0)
+               return "";
+
+       obj = fn;
+       mod = "";
+       mfn = "";
+       fn = kdir "/" substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd";
+       if (getline s <fn == 1) {
+               if (match(s, /DKBUILD_MODFILE=['"]+[^'"]+/) > 0) {
+                       mfn = substr(s, RSTART + 16, RLENGTH - 16);
+                       gsub(/['"]/, "", mfn);
+
+                       mod = mfn;
+                       gsub(/([^/ ]*\/)+/, "", mod);
+                       gsub(/-/, "_", mod);
+               }
+       } else {
+               print "ERROR: Failed to read: " fn "\n\n" \
+                     "  Invalid kernel build directory (" kdir ")\n" \
+                     "  or its content does not match " ARGV[1] >"/dev/stderr";
+               close(fn);
+               total = 0;
+               exit(1);
+       }
+       close(fn);
+
+       # A single module (common case) also reflects objects that are not part
+       # of a module.  Some of those objects have names that are also a module
+       # name (e.g. core).  We check the associated module file name, and if
+       # they do not match, the object is not part of a module.
+       if (mod !~ / /) {
+               if (!(mod in mods))
+                       return "";
+               if (mods[mod] != mfn)
+                       return "";
+       }
+
+       # At this point, mod is a single (valid) module name, or a list of
+       # module names (that do not need validation).
+       omod[obj] = mod;
+       close(fn);
+
+       return mod;
+}


This code is copy-paste from scripts/generate_builtin_ranges.awk
So, my comments in 2/4 can apply to this patch, too.


Instead of adding a separate script,
we could add a "verify mode" option.


 scripts/generate_builtin_ranges.awk --verify ...


But, I do not know how much cleaner it will become.

I am not good at reviewing AWK code, but this
is how you go.




If this script were written in Python,
it would be easy and readable to
split logically-related code chunks into functions,
as follows:


def parse_module_builtin():
    ...


def parse_vmlinux_map_lld():
    ...


def parse_vmlinux_map_bfd():
    ...


def parse_vmlinux_o_map():
    ...



--
Best Regards
Masahiro Yamada
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help