Re: [RFC PATCH 00/11] bpf, trace, dtrace: DTrace BPF program type... | netdev


Re: [RFC PATCH 00/11] bpf, trace, dtrace: DTrace BPF program type implementation and sample use

From: Kris Van Hees <hidden>
Date: 2019-05-24 05:27:21
Also in: bpf, lkml

On Thu, May 23, 2019 at 07:08:51PM -0700, Alexei Starovoitov wrote:

On Thu, May 23, 2019 at 09:57:37PM -0400, Steven Rostedt wrote:

On Thu, 23 May 2019 17:31:50 -0700
Alexei Starovoitov [off-list ref] wrote:

Now from what I'm reading, it seems that the DTrace layer may be
abstracting out fields from the kernel. This is actually something I
have been thinking about to solve the "tracepoint abi" issue. There are
usually basic events that happen: an interrupt goes off, there's a
handler, etc. We could abstract that out so that we trace when an
interrupt goes off and the handler runs, and record the vector
number and/or what device it was for. We have tracepoints in the
kernel that do this, but they do depend a bit on the implementation.
Now, if we could get a layer that abstracts this information away from
the implementation, then I think that's a *good* thing.

I don't like this deferred irq idea at all.

What do you mean deferred?

that's how I interpreted your proposal:
"interrupt goes off and the handler happens, and record the vector number"
It's not a good thing to tell about irq later.
Just like saying let's record a perf counter event and report it later.

The abstraction I mentioned does not defer anything - it merely provides a way
for all probe events to be processed as a generic probe with a set of values
associated with it (e.g. syscall arguments for a syscall entry probe).  The
program that implements what needs to happen when that probe fires still does
whatever is necessary to collect information, and dump data in the output
buffers before execution continues.

I could trace entry into a syscall by using a syscall entry tracepoint or by
putting a kprobe on the syscall function itself.  I am usually interested in
whether the syscall was called, what the arguments were, and perhaps I need to
collect some other data related to it.  More often than not, both probes would
get the job done.  With an abstraction that hides the implementation details
of the probe mechanism itself, both cases are essentially the same.
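
To make that concrete, here is a minimal sketch of the idea in Python rather than BPF. Everything in it is hypothetical and for illustration only: the ProbeEvent type, the from_tracepoint and from_kprobe adapters, the record layout, and the register names are assumptions, not the DTrace implementation or any kernel API. It only shows two probe mechanisms being normalized into one generic probe whose action runs immediately.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ProbeEvent:
    """Generic probe firing: a probe name plus its argument values."""
    name: str
    args: Tuple[int, ...]


def from_tracepoint(record: Dict[str, object]) -> ProbeEvent:
    # Hypothetical tracepoint record layout: {"name": "openat", "args": [...]}.
    return ProbeEvent(f"syscall:{record['name']}:entry", tuple(record["args"]))


def from_kprobe(func: str, regs: Dict[str, int], nargs: int) -> ProbeEvent:
    # Hypothetical kprobe on "sys_<name>": arguments are read from registers
    # in x86-64 argument order (a toy model, not real kprobe handling).
    order = ("di", "si", "dx", "cx", "r8", "r9")
    name = func[len("sys_"):] if func.startswith("sys_") else func
    return ProbeEvent(f"syscall:{name}:entry", tuple(regs[r] for r in order[:nargs]))


def on_syscall_entry(event: ProbeEvent, out: List[Tuple[str, Tuple[int, ...]]]) -> None:
    # The action runs when the probe fires and writes to the output buffer
    # right away; nothing is deferred.
    out.append((event.name, event.args))
```

The same on_syscall_entry action can be fed from either adapter, which is the sense in which both cases are essentially the same.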

Abstracting details from the users is _never_ a good idea.

Really? Most everything we do is to abstract details from the user. The
key is to make the abstraction more meaningful than the raw data.

A ton of people use bcc scripts and bpftrace because they want those details.
They need to know what the kernel is doing to make better decisions.
Delaying the irq record is the opposite.

I never said anything about delaying the record. Just getting the
information that is needed.

I wish that was totally true, but tracepoints *can* be an abi. I had
code reverted because powertop required one to be a specific
format. To this day, the wakeup event has a "success" field that
writes in a hardcoded "1", because there are tools that depend on it,
and they only work if there's a success field and the value is 1.

I really think that you should put powertop nightmares to rest.
That was long ago. The kernel is different now.

Is it?

Linus made it clear several times that it is ok to change _all_
tracepoints. Period. Some maintainers somehow still don't believe
that they can do it.

From what I remember him saying several times, you can change
all tracepoints, but if it breaks a tool that is useful, then that
change will get reverted. He will allow you to go and fix that tool and
bring back the change (which was the solution to powertop).

my interpretation is different.
We changed tracepoints. It broke scripts. People changed scripts.

In my world, the sequence is more like: tracepoints get changed, scripts
break, I fix the provider (abstraction), scripts work again.  Users really
appreciate that aspect because many of our users are not kernel experts.
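
A minimal sketch of that sequence, again in Python and with invented names throughout (provider_v1, provider_v2, user_script, and the "pid", "tid" and "prio" fields are assumptions for illustration, not real tracepoint fields or DTrace providers): the raw layout changes, only the provider is updated, and the user's script is untouched.

```python
from typing import Dict

Raw = Dict[str, int]


def provider_v1(raw: Raw) -> Dict[str, int]:
    # Hypothetical original tracepoint layout with fields "pid" and "prio".
    return {"pid": raw["pid"], "prio": raw["prio"]}


def provider_v2(raw: Raw) -> Dict[str, int]:
    # Hypothetical changed layout: "pid" was renamed to "tid". Only the
    # provider is fixed; it keeps exposing the stable name "pid".
    return {"pid": raw["tid"], "prio": raw["prio"]}


def user_script(args: Dict[str, int]) -> str:
    # The user's script sees only the stable, provider-level argument names.
    return f"pid={args['pid']} prio={args['prio']}"
```

With the old provider, a record in the new layout fails with a KeyError (the break); with the updated provider, user_script produces the same output as before without being edited.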

Some tracepoints are used more than others and more people will
complain: "ohh I need to change my script" when that tracepoint
changes. But the kernel development is not going to be hampered by a
tracepoint. No matter how widespread its usage in scripts.

That's because we'll treat bpf (and DTrace) scripts like modules (no
abi), at least we'd better. But if there's a tool that doesn't use the
script and reads the tracepoint directly via perf, then that's a
different story.

absolutely not.
tracepoint is a tracepoint. It can change regardless of what
is using it and how.
