Re: [PATCH v3 bpf-next 2/8] veth: Add driver XDP
From: Toshiaki Makita <hidden>
Date: 2018-07-24 02:52:33
Hi Jakub,

Thanks for reviewing!

On 2018/07/24 9:23, Jakub Kicinski wrote:
On Mon, 23 Jul 2018 00:13:02 +0900, Toshiaki Makita wrote:
From: Toshiaki Makita <redacted>

This is the basic implementation of veth driver XDP.

Incoming packets are sent from the peer veth device in the form of skb,
so this is generally doing the same thing as generic XDP.

This itself is not so useful, but a starting point to implement other
useful veth XDP features like TX and REDIRECT.

This introduces NAPI when XDP is enabled, because XDP now heavily
relies on NAPI context. Use ptr_ring to emulate NIC ring. Tx function
enqueues packets to the ring and peer NAPI handler drains the ring.

Currently only one ring is allocated for each veth device, so it does
not scale on multiqueue env. This can be resolved by allocating rings
on the per-queue basis later.

Note that NAPI is not used but netif_rx is used when XDP is not loaded,
so this does not change the default behaviour.

v3:
- Fix race on closing the device.
- Add extack messages in ndo_bpf.

v2:
- Squashed with the patch adding NAPI.
- Implement adjust_tail.
- Don't acquire consumer lock because it is guarded by NAPI.
- Make poll_controller noop since it is unnecessary.
- Register rxq_info on enabling XDP rather than on opening the device.

Signed-off-by: Toshiaki Makita <redacted>
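For readers following the ring description above, here is a minimal user-space sketch of the idea: a transmit side enqueues packet pointers into a fixed-size ring and a budgeted poll loop drains it. It is not the kernel's ptr_ring API and not the patch's code. All `toy_*` names and `TOY_RING_SIZE` are hypothetical, the model is single-threaded, and it ignores the locking and memory ordering a real ring needs. It only borrows the convention that a NULL slot means "empty".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 4 /* hypothetical size, chosen only for illustration */

/* Toy single-threaded pointer ring: a NULL slot means "empty". */
struct toy_ring {
	void *slots[TOY_RING_SIZE];
	size_t producer; /* next slot the Tx side writes */
	size_t consumer; /* next slot the poll side reads */
};

/* Tx side: enqueue one packet pointer; returns -1 if the ring is full. */
static int toy_ring_produce(struct toy_ring *r, void *pkt)
{
	assert(r != NULL);
	if (pkt == NULL || r->slots[r->producer] != NULL)
		return -1; /* the caller would drop the packet */
	r->slots[r->producer] = pkt;
	r->producer = (r->producer + 1) % TOY_RING_SIZE;
	return 0;
}

/* Poll side: dequeue one packet pointer, or NULL if the ring is empty. */
static void *toy_ring_consume(struct toy_ring *r)
{
	void *pkt;

	assert(r != NULL);
	pkt = r->slots[r->consumer];
	if (pkt == NULL)
		return NULL;
	r->slots[r->consumer] = NULL;
	r->consumer = (r->consumer + 1) % TOY_RING_SIZE;
	return pkt;
}

/* NAPI-style poll: handle at most `budget` packets, return how many. */
static int toy_poll(struct toy_ring *r, int budget)
{
	int done = 0;

	while (done < budget && toy_ring_consume(r) != NULL)
		done++;
	return done;
}
```

In this sketch a full ring makes the producer fail rather than block, loosely mirroring a Tx path that drops when the peer cannot keep up. A single ring like this is also why the design would not scale across multiple queues.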
+static struct sk_buff *veth_xdp_rcv_skb(struct veth_priv *priv,
+					struct sk_buff *skb)
+{
+	u32 pktlen, headroom, act, metalen;
+	void *orig_data, *orig_data_end;
+	int size, mac_len, delta, off;
+	struct bpf_prog *xdp_prog;
+	struct xdp_buff xdp;
+
+	rcu_read_lock();
+	xdp_prog = rcu_dereference(priv->xdp_prog);
+	if (unlikely(!xdp_prog)) {
+		rcu_read_unlock();
+		goto out;
+	}
+
+	mac_len = skb->data - skb_mac_header(skb);
+	pktlen = skb->len + mac_len;
+	size = SKB_DATA_ALIGN(VETH_XDP_HEADROOM + pktlen) +
+		SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	if (size > PAGE_SIZE)
+		goto drop;
+
+	headroom = skb_headroom(skb) - mac_len;
+	if (skb_shared(skb) || skb_head_is_locked(skb) ||
+	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
+		struct sk_buff *nskb;
+		void *head, *start;
+		struct page *page;
+		int head_off;
+
+		page = alloc_page(GFP_ATOMIC);
+		if (!page)
+			goto drop;
+
+		head = page_address(page);
+		start = head + VETH_XDP_HEADROOM;
+		if (skb_copy_bits(skb, -mac_len, start, pktlen)) {
+			page_frag_free(head);
+			goto drop;
+		}
+
+		nskb = veth_build_skb(head,
+				      VETH_XDP_HEADROOM + mac_len, skb->len,
+				      PAGE_SIZE);
+		if (!nskb) {
+			page_frag_free(head);
+			goto drop;
+		}
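The size check in the quoted hunk can be illustrated with a small user-space sketch of the arithmetic: headroom plus packet length is rounded up to a cache-line multiple, the rounded size of the shared-info trailer is added, and the total must fit in one page. The constants below are assumptions for illustration only (64-byte alignment, 4096-byte page). The `toy_*` names are hypothetical, and the headroom and trailer sizes are passed in as parameters because their real values depend on the kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE 64u  /* assumed alignment unit */
#define TOY_PAGE_SIZE 4096u /* assumed page size */

/* Round x up to the next multiple of TOY_CACHE_LINE (a power of two). */
#define TOY_ALIGN(x) \
	(((size_t)(x) + (TOY_CACHE_LINE - 1)) & ~(size_t)(TOY_CACHE_LINE - 1))

/* Bytes needed for headroom + packet data, plus the trailing metadata area. */
static size_t toy_buf_size(size_t headroom, size_t pktlen, size_t shinfo_size)
{
	assert(headroom + pktlen >= headroom); /* guard against wrap-around */
	return TOY_ALIGN(headroom + pktlen) + TOY_ALIGN(shinfo_size);
}

/* Returns 1 if the packet can be rebuilt inside a single page, else 0. */
static int toy_fits_in_page(size_t headroom, size_t pktlen, size_t shinfo_size)
{
	return toy_buf_size(headroom, pktlen, shinfo_size) <= TOY_PAGE_SIZE;
}
```

With an assumed 256-byte headroom and 320-byte trailer, a 1514-byte frame needs 1792 + 320 = 2112 bytes and fits, while a 3600-byte frame needs 3904 + 320 = 4224 bytes and would take the drop path.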
+static int veth_enable_xdp(struct net_device *dev)
+{
+	struct veth_priv *priv = netdev_priv(dev);
+	int err;
+
+	if (!xdp_rxq_info_is_reg(&priv->xdp_rxq)) {
+		err = xdp_rxq_info_reg(&priv->xdp_rxq, dev, 0);
+		if (err < 0)
+			return err;
+
+		err = xdp_rxq_info_reg_mem_model(&priv->xdp_rxq,
+						 MEM_TYPE_PAGE_SHARED, NULL);

nit: doesn't matter much but looks like a mix of MEM_TYPE_PAGE_SHARED and
MEM_TYPE_PAGE_ORDER0
Actually I'm not sure when to use MEM_TYPE_PAGE_ORDER0. It seems a page
allocated by alloc_page() can be freed by page_frag_free() and it is
more lightweight than put_page(), isn't it?
virtio_net is doing it in a similar way.

--
Toshiaki Makita