Re: [PATCH v2 3/6] treewide: Replace memcpy(..., current->comm) with strscpy()

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2026-05-26 23:06:05
Also in: linux-mm, lkml

On Sun, 24 May 2026 19:38:53 -0300
André Almeida wrote:
> In order to increase the size of current->comm[] and to avoid breaking any
> existing code, replace memcpy() with strscpy(). The later function makes
> sure that the copy is NUL terminated. This is crucial given that the
> source buffer might be larger than the destination buffer and could
> truncate the NUL character out of it.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> Changes from v2:
>  - New patch, dropped strtostr() from last version
> ---
>  include/linux/coredump.h        |  2 +-
>  include/linux/tracepoint.h      |  4 ++--
>  include/trace/events/block.h    | 10 +++++-----
>  include/trace/events/coredump.h |  2 +-
>  include/trace/events/f2fs.h     |  4 ++--
>  include/trace/events/oom.h      |  2 +-
>  include/trace/events/osnoise.h  |  2 +-
>  include/trace/events/sched.h    | 10 +++++-----
>  include/trace/events/signal.h   |  2 +-
>  include/trace/events/task.h     |  4 ++--
>  kernel/printk/nbcon.c           |  2 +-
>  kernel/printk/printk.c          |  2 +-
>  12 files changed, 23 insertions(+), 23 deletions(-)

So I was curious as to what impact this would have on tracing. I decided to
run the following:

    perf stat -r 100 ./hackbench 50

to see how it affects things. Hackbench is a bit of a microbenchmark, but it
stresses the scheduler and thus the scheduler trace events.

I first ran the above and put the output into "stat.baseline", then I enabled
all scheduler trace events:

    trace-cmd start -e sched

and ran it again and put the output into "stat.before".

I applied the patch and ran it again before enabling tracing (just to see
the variance) and put that into "stat.baseline2". I then enabled tracing
and ran it again and put the output into "stat.after".

Here are the results:

stat.baseline:

 Performance counter stats for '/work/c/hackbench 50' (100 runs):

            53,165      context-switches                 #  11002.2 cs/sec  cs_per_second       ( +-  1.33% )
             8,010      cpu-migrations                   #   1657.6 migrations/sec  migrations_per_second  ( +-  0.90% )
            53,936      page-faults                      #  11161.7 faults/sec  page_faults_per_second  ( +-  0.50% )
          4,832.24 msec task-clock                       #      6.0 CPUs  CPUs_utilized         ( +-  0.12% )
        18,787,710      branch-misses                    #      1.2 %  branch_miss_rate         ( +-  0.17% )  (38.88%)
     1,452,653,496      branches                         #    300.6 M/sec  branch_frequency     ( +-  0.14% )  (61.55%)
    15,607,564,080      cpu-cycles                       #      3.2 GHz  cycles_frequency       ( +-  0.15% )  (56.21%)
     7,648,608,518      instructions                     #      0.5 instructions  insn_per_cycle  ( +-  0.11% )  (55.82%)
    12,025,223,911      stalled-cycles-frontend          #     0.77 frontend_cycles_idle        ( +-  0.14% )  (56.26%)

       0.808204663 +- 0.001059873 seconds time elapsed  ( +-  0.13% )

stat.before:

 Performance counter stats for '/work/c/hackbench 50' (100 runs):

            54,722      context-switches                 #  11041.0 cs/sec  cs_per_second       ( +-  1.35% )
             8,170      cpu-migrations                   #   1648.4 migrations/sec  migrations_per_second  ( +-  1.08% )
            54,295      page-faults                      #  10954.8 faults/sec  page_faults_per_second  ( +-  0.53% )
          4,956.27 msec task-clock                       #      6.0 CPUs  CPUs_utilized         ( +-  0.14% )
        19,304,657      branch-misses                    #      1.2 %  branch_miss_rate         ( +-  0.20% )  (37.27%)
     1,497,794,368      branches                         #    302.2 M/sec  branch_frequency     ( +-  0.17% )  (60.74%)
    16,037,658,236      cpu-cycles                       #      3.2 GHz  cycles_frequency       ( +-  0.16% )  (57.72%)
     7,875,024,533      instructions                     #      0.5 instructions  insn_per_cycle  ( +-  0.13% )  (57.83%)
    12,344,722,147      stalled-cycles-frontend          #     0.77 frontend_cycles_idle        ( +-  0.17% )  (55.77%)

       0.827636161 +- 0.001027531 seconds time elapsed  ( +-  0.12% )


stat.baseline2:

 Performance counter stats for '/work/c/hackbench 50' (100 runs):

            52,590      context-switches                 #  10837.7 cs/sec  cs_per_second       ( +-  1.18% )
             7,958      cpu-migrations                   #   1640.0 migrations/sec  migrations_per_second  ( +-  0.99% )
            53,819      page-faults                      #  11090.9 faults/sec  page_faults_per_second  ( +-  0.48% )
          4,852.52 msec task-clock                       #      6.0 CPUs  CPUs_utilized         ( +-  0.11% )
        18,933,395      branch-misses                    #      1.2 %  branch_miss_rate         ( +-  0.18% )  (37.13%)
     1,451,361,950      branches                         #    299.1 M/sec  branch_frequency     ( +-  0.13% )  (60.09%)
    15,683,586,735      cpu-cycles                       #      3.2 GHz  cycles_frequency       ( +-  0.13% )  (56.05%)
     7,628,894,710      instructions                     #      0.5 instructions  insn_per_cycle  ( +-  0.10% )  (57.22%)
    12,063,750,082      stalled-cycles-frontend          #     0.77 frontend_cycles_idle        ( +-  0.14% )  (57.11%)

       0.811536383 +- 0.001337259 seconds time elapsed  ( +-  0.16% )

stat.after:

 Performance counter stats for '/work/c/hackbench 50' (100 runs):

            53,799      context-switches                 #  10743.3 cs/sec  cs_per_second       ( +-  1.35% )
             8,095      cpu-migrations                   #   1616.5 migrations/sec  migrations_per_second  ( +-  0.86% )
            54,330      page-faults                      #  10849.4 faults/sec  page_faults_per_second  ( +-  0.55% )
          5,007.67 msec task-clock                       #      6.0 CPUs  CPUs_utilized         ( +-  0.13% )
        19,444,339      branch-misses                    #      1.2 %  branch_miss_rate         ( +-  0.21% )  (38.04%)
     1,504,382,421      branches                         #    300.4 M/sec  branch_frequency     ( +-  0.17% )  (60.42%)
    16,225,153,060      cpu-cycles                       #      3.2 GHz  cycles_frequency       ( +-  0.16% )  (56.19%)
     7,889,645,005      instructions                     #      0.5 instructions  insn_per_cycle  ( +-  0.16% )  (56.30%)
    12,488,115,947      stalled-cycles-frontend          #     0.77 frontend_cycles_idle        ( +-  0.16% )  (55.55%)

       0.835123855 +- 0.001015781 seconds time elapsed  ( +-  0.12% )


Looking at the difference in cpu-cycles between baseline and baseline2, we have:

  15,607,564,080 vs 15,683,586,735, which is an increase of 0.4% (in the noise).

But with tracing enabled, between before and after we have:

  16,037,658,236 vs 16,225,153,060, which is 1.1%. That may be low, but it's not insignificant.

Given that enabling tracing already slowed the code down by 2.7%
(16,037,658,236 vs 15,607,564,080), adding another 1% is quite an impact!

Tracing now slows it down by 3.9%, which is a significant increase from 2.7%.

I'd really rather keep memcpy() here.

-- Steve