Re: [PATCH v2 3/6] treewide: Replace memcpy(..., current->comm) with strscpy()
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2026-05-26 23:06:05
Also in:
linux-mm, lkml
On Sun, 24 May 2026 19:38:53 -0300 André Almeida wrote:
> In order to increase the size of current->comm[] and to avoid breaking any
> existing code, replace memcpy() with strscpy(). The later function makes
> sure that the copy is NUL terminated. This is crucial given that the source
> buffer might be larger than the destination buffer and could truncate the
> NUL character out of it.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> Changes from v2:
>  - New patch, dropped strtostr() from last version
> ---
>  include/linux/coredump.h        |  2 +-
>  include/linux/tracepoint.h      |  4 ++--
>  include/trace/events/block.h    | 10 +++++-----
>  include/trace/events/coredump.h |  2 +-
>  include/trace/events/f2fs.h     |  4 ++--
>  include/trace/events/oom.h      |  2 +-
>  include/trace/events/osnoise.h  |  2 +-
>  include/trace/events/sched.h    | 10 +++++-----
>  include/trace/events/signal.h   |  2 +-
>  include/trace/events/task.h     |  4 ++--
>  kernel/printk/nbcon.c           |  2 +-
>  kernel/printk/printk.c          |  2 +-
>  12 files changed, 23 insertions(+), 23 deletions(-)
So I was curious as to what impact this would have on tracing. I decided to
run the following:
  perf stat -r 100 ./hackbench 50
To see how it affects things. Hackbench is a bit of a microbenchmark, but it
stresses the scheduler and thus the scheduler trace events.
I first ran the above and put the output into "stat.baseline", then I enabled
all scheduler trace events:
  trace-cmd start -e sched
and ran it again and put the output into "stat.before".
I applied the patch and ran it again before enabling tracing (just to see
the variance) and put that into "stat.baseline2". I then enabled tracing
and ran it again and put the output into "stat.after".
Here are the results:
stat.baseline:
  Performance counter stats for '/work/c/hackbench 50' (100 runs):
            53,165  context-switches         # 11002.2 cs/sec cs_per_second ( +- 1.33% )
             8,010  cpu-migrations           # 1657.6 migrations/sec migrations_per_second ( +- 0.90% )
            53,936  page-faults              # 11161.7 faults/sec page_faults_per_second ( +- 0.50% )
     4,832.24 msec  task-clock               # 6.0 CPUs CPUs_utilized ( +- 0.12% )
        18,787,710  branch-misses            # 1.2 % branch_miss_rate ( +- 0.17% ) (38.88%)
     1,452,653,496  branches                 # 300.6 M/sec branch_frequency ( +- 0.14% ) (61.55%)
    15,607,564,080  cpu-cycles               # 3.2 GHz cycles_frequency ( +- 0.15% ) (56.21%)
     7,648,608,518  instructions             # 0.5 instructions insn_per_cycle ( +- 0.11% ) (55.82%)
    12,025,223,911  stalled-cycles-frontend  # 0.77 frontend_cycles_idle ( +- 0.14% ) (56.26%)
    0.808204663 +- 0.001059873 seconds time elapsed ( +- 0.13% )
stat.before:
  Performance counter stats for '/work/c/hackbench 50' (100 runs):
            54,722  context-switches         # 11041.0 cs/sec cs_per_second ( +- 1.35% )
             8,170  cpu-migrations           # 1648.4 migrations/sec migrations_per_second ( +- 1.08% )
            54,295  page-faults              # 10954.8 faults/sec page_faults_per_second ( +- 0.53% )
     4,956.27 msec  task-clock               # 6.0 CPUs CPUs_utilized ( +- 0.14% )
        19,304,657  branch-misses            # 1.2 % branch_miss_rate ( +- 0.20% ) (37.27%)
     1,497,794,368  branches                 # 302.2 M/sec branch_frequency ( +- 0.17% ) (60.74%)
    16,037,658,236  cpu-cycles               # 3.2 GHz cycles_frequency ( +- 0.16% ) (57.72%)
     7,875,024,533  instructions             # 0.5 instructions insn_per_cycle ( +- 0.13% ) (57.83%)
    12,344,722,147  stalled-cycles-frontend  # 0.77 frontend_cycles_idle ( +- 0.17% ) (55.77%)
    0.827636161 +- 0.001027531 seconds time elapsed ( +- 0.12% )
stat.baseline2:
  Performance counter stats for '/work/c/hackbench 50' (100 runs):
            52,590  context-switches         # 10837.7 cs/sec cs_per_second ( +- 1.18% )
             7,958  cpu-migrations           # 1640.0 migrations/sec migrations_per_second ( +- 0.99% )
            53,819  page-faults              # 11090.9 faults/sec page_faults_per_second ( +- 0.48% )
     4,852.52 msec  task-clock               # 6.0 CPUs CPUs_utilized ( +- 0.11% )
        18,933,395  branch-misses            # 1.2 % branch_miss_rate ( +- 0.18% ) (37.13%)
     1,451,361,950  branches                 # 299.1 M/sec branch_frequency ( +- 0.13% ) (60.09%)
    15,683,586,735  cpu-cycles               # 3.2 GHz cycles_frequency ( +- 0.13% ) (56.05%)
     7,628,894,710  instructions             # 0.5 instructions insn_per_cycle ( +- 0.10% ) (57.22%)
    12,063,750,082  stalled-cycles-frontend  # 0.77 frontend_cycles_idle ( +- 0.14% ) (57.11%)
    0.811536383 +- 0.001337259 seconds time elapsed ( +- 0.16% )
stat.after:
  Performance counter stats for '/work/c/hackbench 50' (100 runs):
            53,799  context-switches         # 10743.3 cs/sec cs_per_second ( +- 1.35% )
             8,095  cpu-migrations           # 1616.5 migrations/sec migrations_per_second ( +- 0.86% )
            54,330  page-faults              # 10849.4 faults/sec page_faults_per_second ( +- 0.55% )
     5,007.67 msec  task-clock               # 6.0 CPUs CPUs_utilized ( +- 0.13% )
        19,444,339  branch-misses            # 1.2 % branch_miss_rate ( +- 0.21% ) (38.04%)
     1,504,382,421  branches                 # 300.4 M/sec branch_frequency ( +- 0.17% ) (60.42%)
    16,225,153,060  cpu-cycles               # 3.2 GHz cycles_frequency ( +- 0.16% ) (56.19%)
     7,889,645,005  instructions             # 0.5 instructions insn_per_cycle ( +- 0.16% ) (56.30%)
    12,488,115,947  stalled-cycles-frontend  # 0.77 frontend_cycles_idle ( +- 0.16% ) (55.55%)
    0.835123855 +- 0.001015781 seconds time elapsed ( +- 0.12% )
Looking at the difference in cpu-cycles between baseline and baseline2, we have:
  15,607,564,080 vs 15,683,586,735, an increase of 0.5% (in the noise).
But with tracing enabled, between before and after we have:
  16,037,658,236 vs 16,225,153,060, an increase of 1.2%. That may be low, but it's not insignificant.
Given that enabling tracing slowed the code down by 2.8% before the patch
(15,607,564,080 vs 16,037,658,236), adding another 1% is quite an impact!
Tracing now slows it down by 4.0% (15,607,564,080 vs 16,225,153,060), which is a
significant increase from 2.8%.
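For anyone who wants to recheck the arithmetic, here is a minimal Python sketch that recomputes the relative cpu-cycles increases from the four reports. The dictionary keys and the pct_increase() helper are names made up for this illustration; the cycle counts are copied from the perf stat output above.

```python
# cpu-cycles means copied from the four perf stat reports above.
CYCLES = {
    "baseline": 15_607_564_080,   # no patch, tracing off
    "before": 16_037_658_236,     # no patch, sched events enabled
    "baseline2": 15_683_586_735,  # patched, tracing off
    "after": 16_225_153_060,      # patched, sched events enabled
}


def pct_increase(old: int, new: int) -> float:
    """Return how much larger `new` is than `old`, in percent."""
    return (new - old) / old * 100.0


# (old, new) pairs, in the order they are discussed in the text.
COMPARISONS = [
    ("baseline", "baseline2"),  # run-to-run variance with tracing off
    ("before", "after"),        # cost of the patch with tracing on
    ("baseline", "before"),     # tracing overhead without the patch
    ("baseline", "after"),      # tracing overhead with the patch
]

for old, new in COMPARISONS:
    delta = pct_increase(CYCLES[old], CYCLES[new])
    print(f"{old:>9} -> {new:<9} {delta:+.2f}%")
```

Both tracing-overhead figures are taken against the unpatched, untraced baseline, so that the two percentages can be compared directly.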
I'd really rather keep memcpy() here.
-- Steve