Re: [PATCH bpf-next 0/7] Add XDP_ATTACH bind() flag to AF_XDP sockets

From: Jesper Dangaard Brouer <hidden>
Date: 2018-12-08 20:42:35

On Sat, 8 Dec 2018 19:43:38 +0100
Björn Töpel [off-list ref] wrote:

Den lör 8 dec. 2018 kl 15:52 skrev Jesper Dangaard Brouer [off-list ref]:

quoted

On Fri, 7 Dec 2018 15:01:55 +0100
Björn Töpel [off-list ref] wrote:

quoted

Den fre 7 dec. 2018 kl 14:42 skrev Jesper Dangaard Brouer [off-list ref]:

quoted

On Fri,  7 Dec 2018 12:44:24 +0100
Björn Töpel [off-list ref] wrote:

quoted

The rationale behind attach is performance and ease of use. Many XDP
socket users just need a simple way of creating/binding a socket and
receiving frames right away without loading an XDP program.

XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply
is a kernel provided XDP program that is installed to the netdev when
XDP_ATTACH is being passed as a bind() flag.

The builtin program is the simplest program possible to redirect a
frame to an attached socket. In restricted C it would look like this:

  SEC("xdp")
  int xdp_prog(struct xdp_md *ctx)
  {
        return bpf_xsk_redirect(ctx);
  }

The builtin program loaded via XDP_ATTACH behaves, from an
install-to-netdev/uninstall-from-netdev point of view, differently
from regular XDP programs. The easiest way to look at it is as a
2-level hierarchy, where regular XDP programs has precedence over the
builtin one.

If no regular XDP program is installed to the netdev, the builtin will
be install. If the builtin program is installed, and a regular is
installed, regular XDP program will have precedence over the builtin
one.

Further, if a regular program is installed, and later removed, the
builtin one will automatically be installed.

The sxdp_flags field of struct sockaddr_xdp gets two new options
XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which maps to the
corresponding XDP netlink install flags.

The builtin XDP program functionally adds even more complexity to the
already hard to read dev_change_xdp_fd. Maybe it would be simpler to
store the program in the struct net_device together with the install
flags instead of calling the ndo_bpf multiple times?

(As far as I can see from reading the code, correct me if I'm wrong.)

If an AF_XDP program uses XDP_ATTACH, then it installs the
builtin-program as the XDP program on the "entire" device.  That means
all RX-queues will call this XDP-bpf program (indirect call), and it is
actually only relevant for the specific queue_index.  Yes, the helper
call does check that the 'xdp->rxq->queue_index' for an attached 'xsk'
and return XDP_PASS if it is NULL:

Yes, you are correct. The builtin XDP program, just like a regular XDP
program, affects the whole netdev. So, yes the non-AF_XDP queues would
get a performance hit from this. Just to reiterate -- this isn't new
for this series. This has always been the case for XDP when acting on
just one queue.

quoted

+BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp)
+{
+       struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+       struct xdp_sock *xsk;
+
+       xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk);
+       if (xsk) {
+               ri->xsk = xsk;
+               return XDP_REDIRECT;
+       }
+
+       return XDP_PASS;
+}

Why do every normal XDP_PASS packet have to pay this overhead (indirect
call), when someone loads an AF_XDP socket program?  The AF_XDP socket
is tied hard and only relevant to a specific RX-queue (which is why we
get a performance boost due to SPSC queues).

I acknowledge there is a need for this, but this use-case shows there
is a need for attaching XDP programs per RX-queue basis.

From my AF_XDP perspective, having a program per queue would make
sense. The discussion of a per-queue has been up before, and I think
the conclusion was that it would be too complex from a
configuration/tooling point-of-view. Again, for AF_XDP this would be
great.

I really think it is about time that we introduce these per-queue XDP
programs.  From a configuration/tooling point-of-view, your proposed
solution have the same issues, we have to solve. (Like listing
introspection need to show the per-queue XDP/BPF programs). And you have
almost solved it, by keeping the per-queue program in net_device
struct.

I just to emphasize; My series is only a convenience method for
loading a (sticky/hierarchical) XDP program and a new bpf-function.
Nothing more. I'm not sure what you mean when you say that "proposed
solution have the same issues". Do you mean XDP in general?

*I'm* all in for a per-queue XDP program -- but I might be biased due
to AF_XDP :-P Let's explore in this thread what that would look like
and the semantics of that!

Yes, please! (p.s. I'm going on vacation next week, which is
unfortunate as I really want to find a good semantics for this).

quoted

Alexei already requested more types of builtin programs.  But as the
XDP-driver API can only load XDP for the entire NIC, then how can
you use two builtin programs at the same time?

You can't. Again, this is just another way of loading an XDP program.
If socket A loads a builtin B on dev C, and socket X tries to load a
builtin Y on dev C this will fail/clash. This is not different from
today. My view on builtin has been "an XDP program provided by the
kernel". A related problem is if socket A attaches with XDP_DRV and
socket B attaches with XDP_SKB on the same netdev. (Ick, this is the
messy parts of XDP. :-()

This again show we need per-RX-queue XDP programs.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help