Thread (6 messages), 5 authors, 2022-05-27

Re: [PATCH] procfs: add syscall statistics

From: Greg KH <gregkh@linuxfoundation.org>
Date: 2022-05-27 12:13:25
Also in: linux-arm-kernel, linux-doc, linux-fsdevel, linux-s390, lkml

On Fri, May 27, 2022 at 07:09:59PM +0800, Zhang Yuchen wrote:
> Add /proc/syscalls to display percpu syscall count.
>
> We need a less resource-intensive way to count syscall per cpu
> for system problem location.

Why?

How is this less resource intensive than perf?

> There is a similar utility syscount in the BCC project, but syscount
> has a high performance cost.

What is that cost?

> The following is a comparison on the same machine, using UnixBench
> System Call Overhead:
>
>     ┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
>     ┃ Change        ┃ Unixbench Score ┃ Loss   ┃
>     ┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
>     │ no change     │ 1072.6          │ ---    │
>     │ syscall count │ 982.5           │ 8.40%  │
>     │ bpf syscount  │ 614.2           │ 42.74% │
>     └───────────────┴─────────────────┴────────┘

Again, what about perf?

> UnixBench System Call Use sys_gettid to test, this system call only reads
> one variable, so the performance penalty seems large. When tested with
> fork, the test scores were almost the same.
>
> So the conclusion is that it does not have a significant impact on system
> call performance.

8% is huge for a system-wide decrease in performance.  Who would ever
use this?

> This function depends on CONFIG_FTRACE_SYSCALLS because the system call
> number is stored in syscall_metadata.
>
> Signed-off-by: Zhang Yuchen <redacted>
> ---
>  Documentation/filesystems/proc.rst       | 28 +++++++++
>  arch/arm64/include/asm/syscall_wrapper.h |  2 +-
>  arch/s390/include/asm/syscall_wrapper.h  |  4 +-
>  arch/x86/include/asm/syscall_wrapper.h   |  2 +-
>  fs/proc/Kconfig                          |  7 +++
>  fs/proc/Makefile                         |  1 +
>  fs/proc/syscall.c                        | 79 ++++++++++++++++++++++++
>  include/linux/syscalls.h                 | 51 +++++++++++++--
>  8 files changed, 165 insertions(+), 9 deletions(-)
>  create mode 100644 fs/proc/syscall.c
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 1bc91fb8c321..80394a98a192 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -686,6 +686,7 @@ files are there, and which are missing.
>   fs 	      File system parameters, currently nfs/exports	(2.4)
>   ide          Directory containing info about the IDE subsystem
>   interrupts   Interrupt usage
> + syscalls     Syscall count for each cpu
>   iomem 	      Memory map					(2.4)
>   ioports      I/O port usage
>   irq 	      Masks for irq to cpu affinity			(2.4)(smp?)
> @@ -1225,6 +1226,33 @@ Provides counts of softirq handlers serviced since boot time, for each CPU.
>      HRTIMER:         0          0          0          0
>  	RCU:      1678       1769       2178       2250
>
> +syscalls
> +~~~~~~~~
> +
> +Provides counts of syscall since boot time, for each cpu.
> +
> +::
> +
> +    > cat /proc/syscalls
> +               CPU0       CPU1       CPU2       CPU3
> +      0:       3743       3099       3770       3242   sys_read
> +      1:        222        559        822        522   sys_write
> +      2:          0          0          0          0   sys_open
> +      3:       6481      18754      12077       7349   sys_close
> +      4:      11362      11120      11343      10665   sys_newstat
> +      5:       5224      13880       8578       5971   sys_newfstat
> +      6:       1228       1269       1459       1508   sys_newlstat
> +      7:         90         43         64         67   sys_poll
> +      8:       1635       1000       2071       1161   sys_lseek
> +    .... omit the middle line ....
> +    441:          0          0          0          0   sys_epoll_pwait2
> +    442:          0          0          0          0   sys_mount_setattr
> +    443:          0          0          0          0   sys_quotactl_fd
> +    447:          0          0          0          0   sys_memfd_secret
> +    448:          0          0          0          0   sys_process_mrelease
> +    449:          0          0          0          0   sys_futex_waitv
> +    450:          0          0          0          0   sys_set_mempolicy_home_node

So for systems with large numbers of CPUs, these are huge lines?  Have
you tested this on large systems?  If so, how big?

thanks,

greg k-h