Thread (25 messages) 25 messages, 8 authors, 2020-06-04

Re: [PATCH v3 0/7] Statsfs: a new ram-based file system for Linux kernel statistics

From: Paolo Bonzini <pbonzini@redhat.com>
Date: 2020-05-27 21:45:02
Also in: kvm, linux-arm-kernel, linux-doc, linux-fsdevel, linux-mips, linux-s390, lkml, netdev

On 27/05/20 23:27, Jakub Kicinski wrote:
On Wed, 27 May 2020 23:07:53 +0200 Paolo Bonzini wrote:
quoted
quoted
Again, I have little KVM knowledge, but BPF also uses a fd-based API,
and carries stats over the same syscall interface.  
Can BPF stats (for BPF scripts created by whatever process is running in
the system) be collected by an external daemon that does not have access
to the file descriptor?  For KVM it's of secondary importance to gather
stats in the program; it can be nice to have and we are thinking of a
way to export the stats over the fd-based API, but it's less useful than
system-wide monitoring.  Perhaps this is a difference between the two.
Yes, check out bpftool prog list (bpftool code is under tools/bpf/ in
the kernel tree). BPF statistics are under a static key, so you may not
see any on your system. My system shows e.g.:

81: kprobe  name abc  tag cefaa9376bdaae75  gpl run_time_ns 80941 run_cnt 152
	loaded_at 2020-05-26T13:00:24-0700  uid 0
	xlated 512B  jited 307B  memlock 4096B  map_ids 66,64
	btf_id 16

In this example run_time_ns and run_cnt are stats.

The first number on the left is the program ID. BPF has an IDA, and
each object gets an integer id. So admin (or CAP_BPF, I think) can
iterate over the ids and open fds to objects of interest.
Got it, thanks.  But then "I'd hope that whatever daemon collects [BPF]
stats doesn't run as root". :)
quoted
Another case where stats and configuration are separate is CPUs, where
CPU enumeration is done in sysfs but statistics are exposed in various
procfs files such as /proc/interrupts and /proc/stats.
True, but I'm guessing everyone is just okay living with the legacy
procfs format there. Otherwise I'd guess the stats would had been added
to sysfs. I'd be curious to hear the full story there.
Yeah, it's a chicken-and-egg problem in that there's no good place in
sysfs to put statistics right now, which is part of what this filesystem
is trying to solve (the other part is the API).

You can read more about Google's usecase at
http://lkml.iu.edu/hypermail/linux/kernel/2005.0/08056.html, it does
include both network and interrupt stats and it's something that they've
been using in production for quite some time.  We'd like the statsfs API
to be the basis for including something akin to that in Linux.

To be honest, it's unlikely that Emanuele (who has just finished his
internship at Red Hat) and I will pursue the networking stats further
than the demo patch at the end of this series. However, we're trying to
make sure that the API is at least ready for that, and to probe whether
any developers from other subsystems would be interested in using
statsfs.  So thanks for bringing your point of view!

Thanks,

Paolo
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help