Thread (12 messages) 12 messages, 2 authors, 2021-01-22

Re: [PATCH v2 bpf-next 3/4] libbpf: BTF dumper support for typed data

From: Alan Maguire <hidden>
Date: 2021-01-22 19:35:53
Also in: bpf, linux-kselftest, lkml

On Thu, 21 Jan 2021, Andrii Nakryiko wrote:
On Wed, Jan 20, 2021 at 10:56 PM Andrii Nakryiko
[off-list ref] wrote:
quoted
On Sun, Jan 17, 2021 at 2:22 PM Alan Maguire [off-list ref] wrote:
quoted
Add a BTF dumper for typed data, so that the user can dump a typed
version of the data provided.

The API is

int btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
                             const struct btf_dump_emit_type_data_opts *opts,
                             void *data);
Two more things I realized about this API overnight:

1. It's error-prone to specify only the pointer to data without
specifying the size. If user screws up and scecifies wrong type ID or
if BTF data is corrupted, then this API would start reading and
printing memory outside the bounds. I think it's much better to also
require user to specify the size and bail out with error if we reach
the end of the allowed memory area.
Yep, good point, especially given in the tracing context we will likely
only have a subset of the data (e.g. part of the 16k representing a
task_struct).  The way I was approaching this was to return -E2BIG
and append a "..." to the dumped data denoting the data provided
didn't cover the size needed to fully represent the type. The idea is
the structure is too big for the data provided, hence E2BIG, but maybe 
there's a more intuitive way to do this? See below for more...
2. This API would be more useful if it also returns the amount of
"consumed" bytes. That way users can do more flexible and powerful
pretty-printing of raw data. So on success we'll have >= 0 number of
bytes used for dumping given BTF type, or <0 on error. WDYT?
I like it! So 

1. if a user provides a too-big data object, we return the amount we used; and
2. if a user provides a too-small data object, we append "..." to the dump
  and return -E2BIG (or whatever error code).

However I wonder for case 2 if it'd be better to use a snprintf()-like 
semantic rather than an error code, returning the amount we would have 
used. That way we easily detect case 1 (size passed in > return value), 
case 2 (size passed in < return value), and errors can be treated separately.  
Feels to me that dealing with truncated data is going to be sufficiently 
frequent it might be good not to classify it as an error. Let me know if 
you think that makes sense.

I'm working on v3, and hope to have something early next week, but a quick 
reply to a question below...
quoted
quoted
...where the id is the BTF id of the data pointed to by the "void *"
argument; for example the BTF id of "struct sk_buff" for a
"struct skb *" data pointer.  Options supported are

 - a starting indent level (indent_lvl)
 - a set of boolean options to control dump display, similar to those
   used for BPF helper bpf_snprintf_btf().  Options are
        - compact : omit newlines and other indentation
        - noname: omit member names
        - zero: show zero-value members

Default output format is identical to that dumped by bpf_snprintf_btf(),
for example a "struct sk_buff" representation would look like this:

struct sk_buff){
 (union){
  (struct){
Curious, these explicit anonymous (union) and (struct), is that
preferred way for explicitness, or is it just because it makes
implementation simpler and thus was chosen? I.e., if the goal was to
mimic C-style data initialization, you'd just have plain .next = ...,
.prev = ..., .dev = ..., .dev_scratch = ..., all on the same level. So
just checking for myself.
The idea here is that we want to clarify if we're dealing with
an anonymous struct or union.  I wanted to have things work
like a C-style initializer as closely as possible, but I
realized it's not legit to initialize multiple values in a
union, and more importantly when we're trying to visually interpret
data, we really want to know if an anonymous container of data is
a structure (where all values represent different elements in the
structure) or a union (where we're seeing multiple interpretations of
the same value).

Thanks again for the detailed review!

Alan
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help