Thread (7 messages) 7 messages, 4 authors, 2025-01-14

Re: [PATCH] kcov: add unique cover, edge, and cmp modes

From: Joey Jiao <hidden>
Date: 2025-01-14 05:57:42
Also in: linux-doc, linux-mm, lkml, workflows

On Fri, Jan 10, 2025 at 01:16:27PM +0100, Dmitry Vyukov wrote:
On Fri, 10 Jan 2025 at 10:23, Marco Elver [off-list ref] wrote:
quoted
quoted
From: "Jiao, Joey" <redacted>

The current design of KCOV risks frequent buffer overflows. To mitigate
this, new modes are introduced: KCOV_TRACE_UNIQ_PC, KCOV_TRACE_UNIQ_EDGE,
and KCOV_TRACE_UNIQ_CMP. These modes allow for the recording of unique
PCs, edges, and comparison operands (CMP).
There ought to be a cover letter explaining the motivation for this,
and explaining why the new modes would help. Ultimately, what are you
using KCOV for where you encountered this problem?
quoted
Key changes include:
- KCOV_TRACE_UNIQ_[PC|EDGE] can be used together to replace KCOV_TRACE_PC.
- KCOV_TRACE_UNIQ_CMP can be used to replace KCOV_TRACE_CMP mode.
- Introduction of hashmaps to store unique coverage data.
- Pre-allocated entries in kcov_map_init during KCOV_INIT_TRACE to avoid
  performance issues with kmalloc.
- New structs and functions for managing memory and unique coverage data.
- Example program demonstrating the usage of the new modes.
This should be a patch series, carefully splitting each change into a
separate patch.
https://docs.kernel.org/process/submitting-patches.html#split-changes
quoted
With the new hashmap and pre-alloced memory pool added, cover size can't
be set to higher value like 1MB in KCOV_TRACE_PC or KCOV_TRACE_CMP modes
in 2GB device with 8 procs, otherwise it causes frequent oom.

For KCOV_TRACE_UNIQ_[PC|EDGE|CMP] modes, smaller cover size like 8KB can
be used.

Signed-off-by: Jiao, Joey <redacted>
As-is it's hard to review, and the motivation is unclear. A lot of
code was moved and changed, and reviewers need to understand why that
was done besides your brief explanation above.

Generally, KCOV has very tricky constraints, due to being callable
from any context, including NMI. This means adding new dependencies
need to be carefully reviewed. For one, we can see this in genalloc's
header:
quoted
* The lockless operation only works if there is enough memory
* available.  If new memory is added to the pool a lock has to be
* still taken.  So any user relying on locklessness has to ensure
* that sufficient memory is preallocated.
*
* The basic atomic operation of this allocator is cmpxchg on long.
* On architectures that don't have NMI-safe cmpxchg implementation,
* the allocator can NOT be used in NMI handler.  So code uses the
* allocator in NMI handler should depend on
* CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.
And you are calling gen_pool_alloc() from __sanitizer_cov_trace_pc.
Which means this implementation is likely broken on
!CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG architectures (do we have
architectures like that, that support KCOV?).

There are probably other sharp corners due to the contexts KCOV can
run in, but would simply ask you to carefully reason about why each
new dependency is safe.
I am also concerned about the performance effect. Does it add a stack
frame to __sanitizer_cov_trace_pc()? Please show disassm of the
function before/after.
# Before the patch
ffffffff8195df30 __sanitizer_cov_trace_pc:
ffffffff8195df30: f3 0f 1e fa          	endbr64
ffffffff8195df34: 48 8b 04 24          	movq	(%rsp), %rax
ffffffff8195df38: 65 48 8b 0c 25 00 d6 03 00   	movq	%gs:251392, %rcx
ffffffff8195df41: 65 8b 15 c0 f6 6d 7e 	movl	%gs:2121135808(%rip), %edx
ffffffff8195df48: 81 e2 00 01 ff 00    	andl	$16711936, %edx
ffffffff8195df4e: 74 11                	je	17 
<__sanitizer_cov_trace_pc+0x31>
ffffffff8195df50: 81 fa 00 01 00 00    	cmpl	$256, %edx
ffffffff8195df56: 75 35                	jne	53 
<__sanitizer_cov_trace_pc+0x5d>
ffffffff8195df58: 83 b9 1c 16 00 00 00 	cmpl	$0, 5660(%rcx)
ffffffff8195df5f: 74 2c                	je	44 
<__sanitizer_cov_trace_pc+0x5d>
ffffffff8195df61: 8b 91 f8 15 00 00    	movl	5624(%rcx), %edx
ffffffff8195df67: 83 fa 02             	cmpl	$2, %edx
ffffffff8195df6a: 75 21                	jne	33 
<__sanitizer_cov_trace_pc+0x5d>
ffffffff8195df6c: 48 8b 91 00 16 00 00 	movq	5632(%rcx), %rdx
ffffffff8195df73: 48 8b 32             	movq	(%rdx), %rsi
ffffffff8195df76: 48 8d 7e 01          	leaq	1(%rsi), %rdi
ffffffff8195df7a: 8b 89 fc 15 00 00    	movl	5628(%rcx), %ecx
ffffffff8195df80: 48 39 cf             	cmpq	%rcx, %rdi
ffffffff8195df83: 73 08                	jae	8 
<__sanitizer_cov_trace_pc+0x5d>
ffffffff8195df85: 48 89 3a             	movq	%rdi, (%rdx)
ffffffff8195df88: 48 89 44 f2 08       	movq	%rax, 8(%rdx,%rsi,8)
ffffffff8195df8d: 2e e9 cd 3d 8b 09    	jmp	160120269 <__x86_return_thunk>
ffffffff8195df93: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
ffffffff8195df9d: 0f 1f 00             	nopl	(%rax)

# After the patch

vmlinux:	file format ELF64-x86-64


Disassembly of section .text:

ffffffff8195df30 __sanitizer_cov_trace_pc:
ffffffff8195df30: f3 0f 1e fa          	endbr64
ffffffff8195df34: 41 57                	pushq	%r15
ffffffff8195df36: 41 56                	pushq	%r14
ffffffff8195df38: 41 54                	pushq	%r12
ffffffff8195df3a: 53                   	pushq	%rbx
ffffffff8195df3b: 48 8b 5c 24 20       	movq	32(%rsp), %rbx
ffffffff8195df40: 65 4c 8b 34 25 00 d6 03 00   	movq	%gs:251392, %r14
ffffffff8195df49: 65 8b 05 b8 f6 6d 7e 	movl	%gs:2121135800(%rip), %eax
ffffffff8195df50: 25 00 01 ff 00       	andl	$16711936, %eax
ffffffff8195df55: 74 19                	je	25 
<__sanitizer_cov_trace_pc+0x40>
ffffffff8195df57: 3d 00 01 00 00       	cmpl	$256, %eax
ffffffff8195df5c: 0f 85 54 01 00 00    	jne	340 
<__sanitizer_cov_trace_pc+0x186>
ffffffff8195df62: 41 83 be 1c 16 00 00 00      	cmpl	$0, 5660(%r14)
ffffffff8195df6a: 0f 84 46 01 00 00    	je	326 
<__sanitizer_cov_trace_pc+0x186>
ffffffff8195df70: 41 8b 86 f8 15 00 00 	movl	5624(%r14), %eax
ffffffff8195df77: a9 12 00 00 00       	testl	$18, %eax
ffffffff8195df7c: 0f 84 34 01 00 00    	je	308 
<__sanitizer_cov_trace_pc+0x186>
ffffffff8195df82: 41 83 be f8 15 00 00 02      	cmpl	$2, 5624(%r14)
ffffffff8195df8a: 75 25                	jne	37 
<__sanitizer_cov_trace_pc+0x81>
ffffffff8195df8c: 49 8b 86 00 16 00 00 	movq	5632(%r14), %rax
ffffffff8195df93: 48 8b 08             	movq	(%rax), %rcx
ffffffff8195df96: 48 ff c1             	incq	%rcx
ffffffff8195df99: 41 8b 96 fc 15 00 00 	movl	5628(%r14), %edx
ffffffff8195dfa0: 48 39 d1             	cmpq	%rdx, %rcx
ffffffff8195dfa3: 0f 83 0d 01 00 00    	jae	269 
<__sanitizer_cov_trace_pc+0x186>
ffffffff8195dfa9: 48 89 08             	movq	%rcx, (%rax)
ffffffff8195dfac: e9 fe 00 00 00       	jmp	254 
<__sanitizer_cov_trace_pc+0x17f>
ffffffff8195dfb1: 48 89 d8             	movq	%rbx, %rax
ffffffff8195dfb4: 48 c1 e8 20          	shrq	$32, %rax
ffffffff8195dfb8: 49 8b 8e 08 16 00 00 	movq	5640(%r14), %rcx
ffffffff8195dfbf: 4c 8b 79 58          	movq	88(%rcx), %r15
ffffffff8195dfc3: 05 f7 be ad de       	addl	$3735928567, %eax
ffffffff8195dfc8: 8d 93 f7 be ad de    	leal	-559038729(%rbx), %edx
ffffffff8195dfce: 89 c1                	movl	%eax, %ecx
ffffffff8195dfd0: 81 f1 f7 be ad de    	xorl	$3735928567, %ecx
ffffffff8195dfd6: 89 c6                	movl	%eax, %esi
ffffffff8195dfd8: c1 c6 0e             	roll	$14, %esi
ffffffff8195dfdb: 29 f1                	subl	%esi, %ecx
ffffffff8195dfdd: 31 ca                	xorl	%ecx, %edx
ffffffff8195dfdf: 89 ce                	movl	%ecx, %esi
ffffffff8195dfe1: c1 c6 0b             	roll	$11, %esi
ffffffff8195dfe4: 29 f2                	subl	%esi, %edx
ffffffff8195dfe6: 31 d0                	xorl	%edx, %eax
ffffffff8195dfe8: 89 d6                	movl	%edx, %esi
ffffffff8195dfea: c1 c6 19             	roll	$25, %esi
ffffffff8195dfed: 29 f0                	subl	%esi, %eax
ffffffff8195dfef: 89 c6                	movl	%eax, %esi
ffffffff8195dff1: c1 c6 10             	roll	$16, %esi
ffffffff8195dff4: 31 c1                	xorl	%eax, %ecx
ffffffff8195dff6: 29 f1                	subl	%esi, %ecx
ffffffff8195dff8: 31 ca                	xorl	%ecx, %edx
ffffffff8195dffa: 89 ce                	movl	%ecx, %esi
ffffffff8195dffc: c1 c6 04             	roll	$4, %esi
ffffffff8195dfff: 29 f2                	subl	%esi, %edx
ffffffff8195e001: 31 d0                	xorl	%edx, %eax
ffffffff8195e003: c1 c2 0e             	roll	$14, %edx
ffffffff8195e006: 29 d0                	subl	%edx, %eax
ffffffff8195e008: 89 c2                	movl	%eax, %edx
ffffffff8195e00a: c1 c2 18             	roll	$24, %edx
ffffffff8195e00d: 31 c8                	xorl	%ecx, %eax
ffffffff8195e00f: 29 d0                	subl	%edx, %eax
ffffffff8195e011: 44 69 e0 47 86 c8 61 	imull	$1640531527, %eax, %r12d
ffffffff8195e018: 41 c1 ec 11          	shrl	$17, %r12d
ffffffff8195e01c: 4b 8b 04 e7          	movq	(%r15,%r12,8), %rax
ffffffff8195e020: 48 85 c0             	testq	%rax, %rax
ffffffff8195e023: 74 18                	je	24 
<__sanitizer_cov_trace_pc+0x10d>
ffffffff8195e025: 48 83 c0 f8          	addq	$-8, %rax
ffffffff8195e029: 74 12                	je	18 
<__sanitizer_cov_trace_pc+0x10d>
ffffffff8195e02b: 48 39 18             	cmpq	%rbx, (%rax)
ffffffff8195e02e: 0f 84 82 00 00 00    	je	130 
<__sanitizer_cov_trace_pc+0x186>
ffffffff8195e034: 48 8b 40 08          	movq	8(%rax), %rax
ffffffff8195e038: 48 85 c0             	testq	%rax, %rax
ffffffff8195e03b: 75 e8                	jne	-24 
<__sanitizer_cov_trace_pc+0xf5>
ffffffff8195e03d: 49 8b bf 00 00 04 00 	movq	262144(%r15), %rdi
ffffffff8195e044: 48 8b 57 58          	movq	88(%rdi), %rdx
ffffffff8195e048: 48 8b 4f 60          	movq	96(%rdi), %rcx
ffffffff8195e04c: be 20 00 00 00       	movl	$32, %esi
ffffffff8195e051: 45 31 c0             	xorl	%r8d, %r8d
ffffffff8195e054: e8 47 b4 f0 02       	callq	49329223 
<gen_pool_alloc_algo_owner>
ffffffff8195e059: 48 85 c0             	testq	%rax, %rax
ffffffff8195e05c: 74 58                	je	88 
<__sanitizer_cov_trace_pc+0x186>
ffffffff8195e05e: 4b 8d 14 e7          	leaq	(%r15,%r12,8), %rdx
ffffffff8195e062: 48 89 c6             	movq	%rax, %rsi
ffffffff8195e065: 48 89 18             	movq	%rbx, (%rax)
ffffffff8195e068: 48 83 c0 08          	addq	$8, %rax
ffffffff8195e06c: 48 c7 46 08 00 00 00 00      	movq	$0, 8(%rsi)
ffffffff8195e074: 48 c7 46 10 00 00 00 00      	movq	$0, 16(%rsi)
ffffffff8195e07c: 48 8b 0a             	movq	(%rdx), %rcx
ffffffff8195e07f: 48 89 4e 08          	movq	%rcx, 8(%rsi)
ffffffff8195e083: 48 89 56 10          	movq	%rdx, 16(%rsi)
ffffffff8195e087: 48 89 02             	movq	%rax, (%rdx)
ffffffff8195e08a: 48 85 c9             	testq	%rcx, %rcx
ffffffff8195e08d: 74 04                	je	4 
<__sanitizer_cov_trace_pc+0x163>
ffffffff8195e08f: 48 89 41 08          	movq	%rax, 8(%rcx)
ffffffff8195e093: 49 8b 86 00 16 00 00 	movq	5632(%r14), %rax
ffffffff8195e09a: 48 8b 08             	movq	(%rax), %rcx
ffffffff8195e09d: 48 ff c1             	incq	%rcx
ffffffff8195e0a0: 41 8b 96 fc 15 00 00 	movl	5628(%r14), %edx
ffffffff8195e0a7: 48 39 d1             	cmpq	%rdx, %rcx
ffffffff8195e0aa: 73 0a                	jae	10 
<__sanitizer_cov_trace_pc+0x186>
ffffffff8195e0ac: 48 89 08             	movq	%rcx, (%rax)
ffffffff8195e0af: 48 8d 04 c8          	leaq	(%rax,%rcx,8), %rax
ffffffff8195e0b3: 48 89 18             	movq	%rbx, (%rax)
ffffffff8195e0b6: 5b                   	popq	%rbx
ffffffff8195e0b7: 41 5c                	popq	%r12
ffffffff8195e0b9: 41 5e                	popq	%r14
ffffffff8195e0bb: 41 5f                	popq	%r15
ffffffff8195e0bd: 2e e9 9d 3c 8b 09    	jmp	160119965 <__x86_return_thunk>
ffffffff8195e0c3: 66 2e 0f 1f 84 00 00 00 00 00	nopw	%cs:(%rax,%rax)
ffffffff8195e0cd: 0f 1f 00             	nopl	(%rax)


So frame to gen_pool_alloc_algo_owner has been added, and instr needs to be 
disabled for it.
Also, I have concerns about interrupts and reentrancy. We are still
getting some reentrant calls from interrupts (not all of them are
filtered by in_task() check). I am afraid these complex hashmaps will
corrupt.
Need more investigate and advice on better way to have uniq info stored.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help