Thread: 8 messages, 2 authors, 2023-07-03

Re: [RFC bpf-next] xsk: honor SO_BINDTODEVICE on bind

From: Magnus Karlsson <hidden>
Date: 2023-07-03 09:49:03
Also in: bpf, lkml, netdev

On Fri, 30 Jun 2023 at 16:58, Ilya Maximets wrote:
Initial creation of an AF_XDP socket requires CAP_NET_RAW capability.
A privileged process might create the socket and pass it to a
non-privileged process for later use.  However, that process will be
able to bind the socket to any network interface.  Even though it will
not be able to receive any traffic without modification of the BPF map,
the situation is not ideal.

Sockets already have a mechanism that can be used to restrict what
interface they can be attached to.  That is SO_BINDTODEVICE.

To change the binding the process will need CAP_NET_RAW.

Make xsk_bind() honor the SO_BINDTODEVICE in order to allow safer
workflow when non-privileged process is using AF_XDP.

Rebinding an AF_XDP socket is not allowed today. Any such attempt will
return an error from bind. So if I understand the purpose of
SO_BINDTODEVICE correctly, you could say that this option is always
set for an AF_XDP socket and it is not possible to toggle it. The only
way to "rebind" an AF_XDP socket is to close it and open a new one.
This was a conscious design decision from day one as it would be very
hard to support this, especially in zero-copy mode.
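
For readers unfamiliar with the workflow the patch description has in mind, the following is a minimal userspace sketch, assuming a Linux system. The helper name restrict_socket_to_device() is hypothetical and not part of the patch; the only real API used is setsockopt(2) with SOL_SOCKET and SO_BINDTODEVICE. A privileged process could call something like this on a freshly created AF_XDP socket before handing the descriptor to a non-privileged process. With the proposed patch, a later bind() naming a different interface would then be rejected with EINVAL.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <net/if.h>      /* IFNAMSIZ */
#include <string.h>
#include <sys/socket.h>  /* setsockopt, SOL_SOCKET, SO_BINDTODEVICE */

/*
 * Hypothetical helper (not part of the patch): pin an already-created
 * socket to one interface by name. Returns 0 on success, or -1 with
 * errno set on failure.
 */
static int restrict_socket_to_device(int fd, const char *ifname)
{
	size_t len;

	if (ifname == NULL) {
		errno = EINVAL;
		return -1;
	}

	/* Reject empty names and names that do not fit in IFNAMSIZ. */
	len = strnlen(ifname, IFNAMSIZ);
	if (len == 0 || len >= IFNAMSIZ) {
		errno = EINVAL;
		return -1;
	}

	/* Pass the name including its terminating NUL byte. */
	return setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE,
			  ifname, (socklen_t)(len + 1));
}
```

A privileged parent might call restrict_socket_to_device(xsk_fd, "eth0") before passing xsk_fd over a UNIX domain socket; both the descriptor name and the interface name here are chosen purely for illustration.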
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---

Posting as an RFC for now to probably get some feedback.
Will re-post once the tree is open.

 Documentation/networking/af_xdp.rst | 9 +++++++++
 net/xdp/xsk.c                       | 6 ++++++
 2 files changed, 15 insertions(+)
diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
index 247c6c4127e9..1cc35de336a4 100644
--- a/Documentation/networking/af_xdp.rst
+++ b/Documentation/networking/af_xdp.rst
@@ -433,6 +433,15 @@ start N bytes into the buffer leaving the first N bytes for the
 application to use. The final option is the flags field, but it will
 be dealt with in separate sections for each UMEM flag.

+SO_BINDTODEVICE setsockopt
+--------------------------
+
+This is a generic SOL_SOCKET option that can be used to tie AF_XDP
+socket to a particular network interface.  It is useful when a socket
+is created by a privileged process and passed to a non-privileged one.
+Once the option is set, kernel will refuse attempts to bind that socket
+to a different interface.  Updating the value requires CAP_NET_RAW.
+
 XDP_STATISTICS getsockopt
 -------------------------
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 5a8c0dd250af..386ff641db0f 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -886,6 +886,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
        struct sock *sk = sock->sk;
        struct xdp_sock *xs = xdp_sk(sk);
        struct net_device *dev;
+       int bound_dev_if;
        u32 flags, qid;
        int err = 0;
@@ -899,6 +900,11 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
                      XDP_USE_NEED_WAKEUP))
                return -EINVAL;

+       bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
+
+       if (bound_dev_if && bound_dev_if != sxdp->sxdp_ifindex)
+               return -EINVAL;
+
        rtnl_lock();
        mutex_lock(&xs->mutex);
        if (xs->state != XSK_READY) {
--
2.40.1
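
The condition added to xsk_bind() in the hunk above can be modelled in plain userspace C to make its three cases explicit. This is only a sketch: xsk_bind_allowed() is a hypothetical name, and plain int arguments stand in for sk->sk_bound_dev_if and sxdp->sxdp_ifindex.

```c
#include <errno.h>

/*
 * Userspace model of the check the patch adds to xsk_bind().
 * Hypothetical function, for illustration only.
 *
 * bound_dev_if      - value set via SO_BINDTODEVICE, 0 if unset
 * requested_ifindex - interface index passed to bind()
 */
static int xsk_bind_allowed(int bound_dev_if, int requested_ifindex)
{
	/* A device restriction is set and does not match the request. */
	if (bound_dev_if && bound_dev_if != requested_ifindex)
		return -EINVAL;

	/* No restriction set, or the request matches the restriction. */
	return 0;
}
```

In this model an unset option (0) permits any interface, a matching index is permitted, and a mismatching index yields -EINVAL, mirroring the early return in the patch.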