Thread: 47 messages, 9 authors, 2019-03-05

Re: [PATCH bpf-next v2 1/7] bpf: implement lookup-free direct value access

From: Daniel Borkmann <daniel@iogearbox.net>
Date: 2019-03-01 19:51:17
Also in: bpf

On 03/01/2019 06:18 PM, Yonghong Song wrote:
On 2/28/19 3:18 PM, Daniel Borkmann wrote:
This generic extension to BPF maps allows for directly loading an
address residing inside a BPF map value as a single BPF ldimm64
instruction.

The idea is similar to what BPF_PSEUDO_MAP_FD does today: it is a
special src_reg flag for the ldimm64 instruction indicating that
the imm field in the first part of the double insn holds a file
descriptor, which the verifier then replaces with the full 64-bit
address of the map, spread across both imm parts.

For the newly added BPF_PSEUDO_MAP_VALUE src_reg flag, the idea
is similar: the imm field in the first part of the double insn is
again a file descriptor corresponding to the map, and the imm field
in the second part is an offset. The verifier will then replace
both imm parts with an address that points into the BPF map value
for maps that support this operation. BPF_PSEUDO_MAP_VALUE is a
distinct flag because otherwise, with BPF_PSEUDO_MAP_FD alone, we
could not distinguish a load of the map pointer from a load of the
map's value at offset 0.
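
The encoding described above can be sketched as follows. This is only
an illustration: struct toy_insn is a local mirror of the instruction
layout, and emit_ld_imm64() and the TOY_* names are hypothetical
helpers invented for this example, not code from the patch. The
numeric values 1 and 2 for the two pseudo flags and 0x18 for the
ldimm64 opcode are assumed to match the UAPI header.

```c
#include <stdint.h>

/* Local mirror of the 8-byte BPF instruction layout (illustrative only). */
struct toy_insn {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination register */
    uint8_t src_reg : 4; /* source register, or pseudo flag for ldimm64 */
    int16_t off;
    int32_t imm;
};

#define TOY_LD_IMM64         0x18 /* BPF_LD | BPF_DW | BPF_IMM */
#define TOY_PSEUDO_MAP_FD    1    /* load the map pointer */
#define TOY_PSEUDO_MAP_VALUE 2    /* load an address inside the map value */

/*
 * Fill the two-instruction ldimm64 pair. For TOY_PSEUDO_MAP_VALUE the
 * first imm carries the map fd and the second imm carries the offset
 * into the value; for TOY_PSEUDO_MAP_FD the second imm is left at 0.
 */
static void emit_ld_imm64(struct toy_insn out[2], uint8_t dst,
                          uint8_t pseudo, int32_t imm_first,
                          int32_t imm_second)
{
    out[0] = (struct toy_insn){
        .code = TOY_LD_IMM64,
        .dst_reg = dst,
        .src_reg = pseudo,
        .off = 0,
        .imm = imm_first,
    };
    /* The second half of the pair only carries the upper imm slot. */
    out[1] = (struct toy_insn){ .imm = imm_second };
}
```

With this helper, emit_ld_imm64(pair, 1, TOY_PSEUDO_MAP_FD, fd, 0) and
emit_ld_imm64(pair, 1, TOY_PSEUDO_MAP_VALUE, fd, 0) differ only in
src_reg, which is why a separate flag is needed to tell the map
pointer apart from the value at offset 0.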

This allows for efficiently retrieving an address to a map value
memory area without having to issue a helper call which needs to
prepare registers according to calling convention, etc, without
needing the extra NULL test, and without having to add the offset
in an additional instruction to the value base pointer.

The verifier then treats the destination register as PTR_TO_MAP_VALUE
with constant reg->off from the user passed offset from the second
imm field, and guarantees that this is within bounds of the map
value. Any subsequent operations are normally treated as typical
map value handling without anything else needed for verification.
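
The bounds guarantee mentioned above could look roughly like the
sketch below. It is a simplified model under stated assumptions, not
the patch's implementation: struct toy_map and toy_direct_value_addr()
are hypothetical names, and the map is reduced to a single value
buffer with a known size.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a map with one directly addressable value. */
struct toy_map {
    uint8_t *value;      /* start of the value memory area */
    uint32_t value_size; /* size of the value in bytes */
};

/*
 * Resolve a user-supplied offset into an address inside the value.
 * Returns 0 and stores the address in *addr on success, or -EINVAL
 * if the offset does not fall inside the value.
 */
static int toy_direct_value_addr(const struct toy_map *map, uint32_t off,
                                 uint64_t *addr)
{
    if (off >= map->value_size)
        return -EINVAL;

    *addr = (uint64_t)(uintptr_t)map->value + off;
    return 0;
}
```

Once the offset has been checked at load time, the resulting pointer
needs no NULL test in the program, since it always refers to valid
value memory.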

The two map operations for direct value access have been added to
the array map for now. In the future, other map types could be
supported as well, depending on the use case. The main use case for
this commit is to allow for BPF loader support for global variables
that reside in .data/.rodata/.bss sections, such that we can
directly load their address with minimal additional infrastructure
required. Loader support for the libbpf library has been added in
subsequent commits.

Version 1 of the patch provided a way to replace the load with an
immediate (presumably read-only data). That is good for use cases
like the one below:

    if (static_variable_kernel_version == V1) {
        /* code here will work for kernel V1 */
        ... access helpers available for V1 ...
    } else if (static_variable_kernel_version == V2) {
        /* code here will work for kernel V2 */
        ... access helpers available for V2 ...
    }

The approach here does not replace the map value access with values
from, e.g., a read-only section for which libbpf could provide an
interface to fill in data from the user.

This may require a little more analysis, e.g.,
    ptr = ld_imm64 from a readonly section
    ...
    *(u32 *)ptr;
    *(u64 *)(ptr + 8);
    ...

Do you think we could do this in the kernel verifier, or should we
push the whole read-only handling into user space?

And in your case the static_variable_kernel_version would be determined
at runtime, for example, where you then would want to eliminate all the
other branches, right? Meaning, you'd need a way to turn this into an
imm load such that the verifier will detect these dead branches and
patch them out, which it should already be able to do. How would you
mark these special vars like static_variable_kernel_version such that
they get special treatment compared to the rest, some sort of builtin?
Potentially one could get away with doing this from the loader side if
it's simple enough, though one thing that would be good to avoid is
duplicating all the complex branch fixup logic etc. that we already
have in the kernel. Are you thinking of marking these via BTF in some
way such that the loader does the inline replacement?
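
To illustrate why an imm load makes the dead branches detectable,
here is a toy model of deciding a conditional jump when the compared
register holds a known constant. It is a sketch only: the names
toy_branch_taken() and enum toy_cmp are hypothetical, and the
verifier's real logic is considerably more involved.

```c
#include <stdint.h>

/* A few comparison kinds, enough for the example. */
enum toy_cmp { TOY_JEQ, TOY_JNE, TOY_JGT };

/*
 * Decide a conditional jump "if reg <op> imm goto target".
 * Returns 1 if the jump is always taken, 0 if it is never taken,
 * and -1 if the register is not a known constant, in which case
 * both paths must be kept.
 */
static int toy_branch_taken(int reg_is_const, uint64_t reg_val,
                            enum toy_cmp op, uint64_t imm)
{
    if (!reg_is_const)
        return -1;

    switch (op) {
    case TOY_JEQ:
        return reg_val == imm;
    case TOY_JNE:
        return reg_val != imm;
    case TOY_JGT:
        return reg_val > imm;
    }
    return -1;
}
```

If the version variable were loaded as the constant 2, a check against
1 would yield toy_branch_taken(1, 2, TOY_JEQ, 1) == 0, so that branch
body is dead. If the value instead comes from an ordinary map value
load, it is not a known constant and both branches have to stay.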

Thanks,
Daniel