Re: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2019-10-02 23:00:32
Also in:
bpf, linux-api, netdev
On Wed, 2 Oct 2019 17:18:21 +0000 Alexei Starovoitov [off-list ref] wrote:
quoted
quoted
It's an interesting idea, but I don't think it can work. Please see bpf_trace_printk implementation in kernel/trace/bpf_trace.c It's a lot more than string printing.Well, trace_printk() is just string printing. I was thinking that the bpf_trace_printk() could just use a vsnprintf() into a temporary buffer (like trace_printk() does), and then call the trace event to write it out.are you proposing to replicate get_trace_buf() functionality into bpf_trace_printk?
No, do you need bpf_trace_printk() to run in all contexts? trace_printk() does the get_trace_buf() dance so that it can be called without locks and from any context including NMIs.
So print into temp string buffer is done twice? I'm not excited about such hack. And what's the goal? so that trace_bpf_print(string_msg); can go through _run-time_ check whether that particular trace event was allowed in tracefs ?
No, just to use a standard event instead of hacking into trace_printk().
That's not how file system acls are typically designed. The permission check is at open(). Not at write(). If I understood you correctly you're proposing to check permissions at bpf program run-time which is no good. bpf_trace_printk() already has one small buffer for probe_kernel_read-ing an unknown string to pass into %s. That's not ftrace. That's core tracing. That aspect is covered by CAP_TRACING as well.
Then use that buffer.
quoted
quoted
quoted
The user could then just enable the trace event from the file system. I could also work on making instances work like /tmp does (with the sticky bit) in creation. That way people with write access to the instances directory, can make their own buffers that they can use (and others can't access).We tried instances in bcc in the past and eventually removed all the support. The overhead of instances is too high to be usable.What overhead? An ftrace instance should not have any more overhead than the root one does (it's the same code). Or are you talking about memory overhead?Yes. Memory overhead. Human users doing cat/echo into tracefs won't be creating many instances, so that's the only practical usage of them.
If it's a real event, it can go into any of the ftrace buffers (top level or instance), but it gives you the choice.
quoted
quoted
quoted
quoted
Both 'trace' and 'trace_pipe' have quirky side effects. Like opening 'trace' file will make all parallel trace_printk() to be ignored. While reading 'trace_pipe' file will clear it. The point that traditional 'read' and 'write' ACLs don't map as-is to tracefs, so I would be careful categorizing things into confidentiality vs integrity only based on access type.What exactly is the bpf_trace_printk() used for? I may have other ideas that can help.It's debugging of bpf programs. Same is what printk() is used for by kernel developers.How is it extracted? Just read from the trace or trace_pipe file?yep. Just like kernel devs look at dmesg when they sprinkle printk. btw, if you can fix 'trace' file issue that stops all trace_printk while 'trace' file is open that would be great. Some users have been bitten by this behavior. We even documented it.
The behavior is documented as well in the ftrace documentation. That's why we suggest the trace_pipe redirected into a file so that you don't lose data (unless the writer goes too fast). If you prefer a producer consumer where you lose newer events (like perf does), you can turn off overwrite mode, and it will drop events when the buffer is full (see options/overwrite). -- Steve