Re: [PATCH v2 2/4] llvm-cov: add Clang's MC/DC support
From: Nathan Chancellor <nathan@kernel.org>
Date: 2024-10-02 01:10:34
Also in:
linux-arch, linux-efi, linux-kbuild, linux-um, lkml, llvm
Hi Wentao,

On Wed, Sep 04, 2024 at 11:32:43PM -0500, Wentao Zhang wrote:
Add infrastructure to enable Clang's Modified Condition/Decision Coverage
(MC/DC) [1].
Clang has added MC/DC support as of its 18.1.0 release. MC/DC is a fine-
grained coverage metric required by many automotive and aviation industrial
standards for certifying mission-critical software [2].
In the following example from arch/x86/events/probe.c, llvm-cov gives the
MC/DC measurement for the compound logic decision at line 43.
43| 12| if (msr[bit].test && !msr[bit].test(bit, data))
------------------
|---> MC/DC Decision Region (43:8) to (43:50)
|
| Number of Conditions: 2
| Condition C1 --> (43:8)
| Condition C2 --> (43:25)
|
| Executed MC/DC Test Vectors:
|
| C1, C2 Result
| 1 { T, F = F }
| 2 { T, T = T }
|
| C1-Pair: not covered
| C2-Pair: covered: (1,2)
| MC/DC Coverage for Decision: 50.00%
|
------------------
44| 5| continue;
As the results suggest, during the span of measurement, only condition C2
(!msr[bit].test(bit, data)) is covered. That means C2 was evaluated to both
true and false, and in those test vectors C2 affected the decision outcome
independently. Therefore MC/DC for this decision is 1 out of 2 (50.00%).

Thanks a lot for the detail in the commit message. Your first talk at LPC in the Refereed Track was excellent as well. If the video for that talk becomes available soon, it would be helpful to link that in the commit message as well.
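
To make the "C1-Pair: not covered / C2-Pair: covered: (1,2)" result in the quoted report concrete, here is a minimal sketch of the independence-pair check. It is not llvm-cov's implementation; the function names and the use of None for a short-circuited (unevaluated) condition are assumptions made for illustration only.

```python
from itertools import combinations


def independence_pairs(vectors):
    """Map each condition index to the 1-based vector pairs that show its independence.

    Each vector is (conditions, result). A condition value of None means the
    condition was short-circuited and never evaluated (illustrative convention).
    """
    n = len(vectors[0][0])
    pairs = {c: [] for c in range(n)}
    for (i, (ca, ra)), (j, (cb, rb)) in combinations(enumerate(vectors, start=1), 2):
        if ra == rb:
            continue  # the decision outcome must differ between the two vectors
        for c in range(n):
            if ca[c] is None or cb[c] is None or ca[c] == cb[c]:
                continue  # condition c must be evaluated in both and must differ
            # All other conditions must agree; an unevaluated one acts as a wildcard.
            others_match = all(
                ca[k] is None or cb[k] is None or ca[k] == cb[k]
                for k in range(n)
                if k != c
            )
            if others_match:
                pairs[c].append((i, j))
    return pairs


def mcdc_percent(vectors):
    """Percentage of conditions that have at least one independence pair."""
    pairs = independence_pairs(vectors)
    covered = sum(1 for found in pairs.values() if found)
    return 100.0 * covered / len(pairs)


# The two executed test vectors from the report: {T, F = F} and {T, T = T}.
observed = [((True, False), False), ((True, True), True)]
print(independence_pairs(observed))
print(mcdc_percent(observed))
```

With only these two vectors, C1 never changes value, so no pair can demonstrate its independence. Under the same assumptions, a run where msr[bit].test is NULL would add the vector {F, - = F}, which pairs with vector 2 to cover C1.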
As of Clang 19, users can determine the max number of conditions in a decision to measure via option LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS, which controls -fmcdc-max-conditions flag of Clang cc1 [3]. Since MC/DC implementation utilizes bitmaps to track the execution of test vectors, more memory is consumed if larger decisions are getting counted. The
Some of this could potentially be in the Kconfig text below as it seems relevant for users to make a decision on modifying its value.
maximum value supported by Clang is 32767. According to local experiments, the working maximum for Linux kernel is 46, with the largest decisions in kernel codebase (with 47 conditions, as of v6.11) excluded, otherwise the kernel image size limit will be exceeded. The largest decisions in kernel are contributed for example by macros checking CPUID.

Code exceeding LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS will produce compiler warnings.

As of LLVM 19, certain expressions are still not covered, and will produce build warnings when they are encountered:

"[...] if a boolean expression is embedded in the nest of another boolean expression but separated by a non-logical operator, this is also not supported. For example, in x = (a && b && c && func(d && f)), the d && f case starts a new boolean expression that is separated from the other conditions by the operator func(). When this is encountered, a warning will be generated and the boolean expression will not be instrumented." [4]
These two sets of warnings appear to be pretty noisy in my build testing... Is there any way to shut them up? Perhaps it is good for users to see these limitations but it basically makes the build output useless. If there were switches, then they could be disabled in the default case with a Kconfig option to turn them on if the user is concerned with seeing which parts of their code are not instrumented. I could see developers wanting to run this for writing tests and they might not care about this as much as someone else might.

I did leave LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS at its default value. Perhaps there is a more reasonable default that would result in less noisy build output but not run afoul of potential memory usage concerns? I assume that mention means that memory usage may be a concern for the type of deployments this technology would commonly be used with?
Link: https://en.wikipedia.org/wiki/Modified_condition%2Fdecision_coverage [1]
Link: https://digital-library.theiet.org/content/journals/10.1049/sej.1994.0025 [2]
Link: https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798 [3]
Link: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#mc-dc-instrumentation [4]
Thank you for using this link format :)
Signed-off-by: Wentao Zhang <redacted>
Reviewed-by: Chuck Wolber <redacted>
Tested-by: Chuck Wolber <redacted>
From an actual code perspective, this looks good to me.

Reviewed-by: Nathan Chancellor <nathan@kernel.org>
diff --git a/Makefile b/Makefile
index 51498134c..1185b38d6 100644
--- a/Makefile
+++ b/Makefile
@@ -740,6 +740,12 @@ all: vmlinux
 CFLAGS_LLVM_COV := -fprofile-instr-generate -fcoverage-mapping
 export CFLAGS_LLVM_COV
 
+CFLAGS_LLVM_COV_MCDC := -fcoverage-mcdc
+ifdef CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS
+CFLAGS_LLVM_COV_MCDC += -Xclang -fmcdc-max-conditions=$(CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS)

Why is -Xclang needed here? Is this not a full frontend flag?

+endif
+export CFLAGS_LLVM_COV_MCDC
+
 CFLAGS_GCOV := -fprofile-arcs -ftest-coverage
 ifdef CONFIG_CC_IS_GCC
 CFLAGS_GCOV += -fno-tree-loop-im