Re: [RFC v2 bpf-next 8/9] bpf: Provide helper to do lookups in kernel FIB table
From: David Ahern <hidden>
Date: 2018-05-15 03:46:15
Subsystem:
bpf [general] (safe dynamic programs and tools), the rest · Maintainers:
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Linus Torvalds
On 4/29/18 7:13 PM, David Ahern wrote:
The idea here is to fast-path packets that fit a supported profile and are to be forwarded. Everything else should continue up the stack as it has wider capabilities. The helper and XDP programs should make no assumptions on what the broader kernel and userspace might be monitoring or want to do with packets that cannot be forwarded in the fast path. This is very similar to hardware forwarding when it punts packets to the CPU for control plane assistance.
Thinking about this some more, in particular how to return more information to the bpf program about the FIB lookup: the bpf_fib_lookup struct is 64 bytes and cannot be expanded without hurting performance. I could add another union on an input parameter and return flags indicating why the returned index is 0. Something like this:
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 360a1168c353..75591522444c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2314,6 +2314,12 @@ struct bpf_raw_tracepoint_args {
#define BPF_FIB_LOOKUP_DIRECT BIT(0)
#define BPF_FIB_LOOKUP_OUTPUT BIT(1)
+#define BPF_FIB_LKUP_RET_NO_FWD BIT(0) /* pkt is not fwded */
+#define BPF_FIB_LKUP_RET_UNSUPP_LWT BIT(1) /* fwd requires unsupp encap */
+#define BPF_FIB_LKUP_RET_NO_NHDEV BIT(2) /* nh device does not exist */
+#define BPF_FIB_LKUP_RET_NO_NEIGH BIT(3) /* no neigh entry for nh */
+#define BPF_FIB_LKUP_RET_FRAG_NEEDED BIT(4) /* pkt too big to fwd */
+
struct bpf_fib_lookup {
/* input */
__u8 family; /* network family, AF_INET, AF_INET6, AF_MPLS */
@@ -2325,7 +2331,11 @@ struct bpf_fib_lookup {
/* total length of packet from network header - used for MTU check */
__u16 tot_len;
- __u32 ifindex; /* L3 device index for lookup */
+
+ union {
+ __u32 ifindex; /* in: L3 device index for lookup */
+ __u32 ret_flags; /* out: BPF_FIB_LKUP_RET_* flags */
+ };
union {
/* inputs to lookup */
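
To make the consequence of sharing storage concrete, here is a rough userspace sketch, not kernel code. The flag values are the ones proposed in the diff above; fib_params_sketch, mock_lookup_no_fwd() and sketch_reason() are hypothetical names made up for illustration, and the mock only imitates what the helper might do when it returns index 0.

```c
#include <stdint.h>

/* Flag values as proposed in the diff above, with BIT() expanded by hand. */
#define BPF_FIB_LKUP_RET_NO_FWD       (1U << 0) /* pkt is not fwded */
#define BPF_FIB_LKUP_RET_UNSUPP_LWT   (1U << 1) /* fwd requires unsupp encap */
#define BPF_FIB_LKUP_RET_NO_NHDEV     (1U << 2) /* nh device does not exist */
#define BPF_FIB_LKUP_RET_NO_NEIGH     (1U << 3) /* no neigh entry for nh */
#define BPF_FIB_LKUP_RET_FRAG_NEEDED  (1U << 4) /* pkt too big to fwd */

/* Hypothetical stand-in that holds only the proposed union,
 * not the full 64-byte struct bpf_fib_lookup. */
struct fib_params_sketch {
	union {
		uint32_t ifindex;   /* in: L3 device index for lookup */
		uint32_t ret_flags; /* out: BPF_FIB_LKUP_RET_* flags */
	};
};

/* Hypothetical mock of the "not forwarded" path: report why and return
 * index 0. Writing ret_flags overwrites the input ifindex. */
static int mock_lookup_no_fwd(struct fib_params_sketch *p, uint32_t why)
{
	p->ret_flags = BPF_FIB_LKUP_RET_NO_FWD | why;
	return 0;
}

/* Turn the flags into a label, e.g. for a per-reason counter. */
static const char *sketch_reason(uint32_t flags)
{
	if (flags & BPF_FIB_LKUP_RET_UNSUPP_LWT)
		return "unsupported encap";
	if (flags & BPF_FIB_LKUP_RET_NO_NHDEV)
		return "no nexthop device";
	if (flags & BPF_FIB_LKUP_RET_NO_NEIGH)
		return "no neighbor entry";
	if (flags & BPF_FIB_LKUP_RET_FRAG_NEEDED)
		return "fragmentation needed";
	if (flags & BPF_FIB_LKUP_RET_NO_FWD)
		return "not forwarded";
	return "no reason reported";
}
```

One thing the sketch makes visible: a program that still needs the ingress ifindex after the call has to save it beforehand, since the output flags reuse the same four bytes.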
Similarly, the FIB result could be returned through a union on, say,
family:
union {
__u8 family; /* in: network family, AF_INET, AF_INET6, AF_MPLS */
__u8 rt_type; /* out: FIB lookup route type */
};
Then if the FIB lookup returns -EINVAL, -EHOSTUNREACH, or -EACCES,
rt_type is set to RTN_BLACKHOLE, RTN_UNREACHABLE, or RTN_PROHIBIT
respectively, allowing the XDP program to make an informed decision on
dropping the packet.
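
A minimal sketch of that mapping and of one possible policy on the XDP side, again as plain userspace C. The RTN_* values match include/uapi/linux/rtnetlink.h; sketch_err_to_rt_type() and sketch_verdict_for() are hypothetical names, and dropping on all three route types is an assumption about what a program might choose, not something the helper would enforce.

```c
#include <errno.h>
#include <stdint.h>

/* Route types, same values as in include/uapi/linux/rtnetlink.h. */
enum {
	RTN_BLACKHOLE   = 6,
	RTN_UNREACHABLE = 7,
	RTN_PROHIBIT    = 8,
};

enum sketch_verdict { SKETCH_PASS, SKETCH_DROP };

/* Hypothetical: translate the FIB lookup error into the rt_type that
 * would be written back over the family byte. */
static uint8_t sketch_err_to_rt_type(int err)
{
	switch (err) {
	case -EINVAL:
		return RTN_BLACKHOLE;
	case -EHOSTUNREACH:
		return RTN_UNREACHABLE;
	case -EACCES:
		return RTN_PROHIBIT;
	default:
		return 0; /* RTN_UNSPEC: nothing to report */
	}
}

/* Hypothetical program-side policy: drop early when the route says the
 * packet goes nowhere, otherwise let the stack handle it. */
static enum sketch_verdict sketch_verdict_for(uint8_t rt_type)
{
	switch (rt_type) {
	case RTN_BLACKHOLE:
	case RTN_UNREACHABLE:
	case RTN_PROHIBIT:
		return SKETCH_DROP;
	default:
		return SKETCH_PASS;
	}
}
```

A program that wants the stack to generate ICMP errors for unreachable or prohibit routes would pass those two up instead and drop only on RTN_BLACKHOLE.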
To avoid performance hits on the forwarding path, these return values
would *only* be set if the ifindex returned is 0.