
[PATCH net-next 19/24] net: fungible, gve, mtk, microchip, mana: Use nested-BH locking for XDP redirect.

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: 2023-12-15 17:10:42
Also in: bpf, linux-hyperv, linux-mediatek, lkml
Subsystem: fungible ethernet drivers, hyper-v/azure core and drivers, mediatek ethernet driver, microchip lan966x ethernet driver, networking drivers, the rest, xdp (express data path) · Maintainers: Dimitris Michailidis, "K. Y. Srinivasan", Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, Felix Fietkau, Lorenzo Bianconi, Horatiu Vultur, Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds, Alexei Starovoitov, Daniel Borkmann, David S. Miller, Jesper Dangaard Brouer, John Fastabend

The per-CPU variables used during bpf_prog_run_xdp() invocation and
later during xdp_do_redirect() rely on disabled BH for their protection.
Without locking in local_bh_disable() on PREEMPT_RT, these data
structures require explicit locking.

This is a follow-up to the previous change, which introduced
bpf_run_lock.redirect_lock; the lock is now used within drivers.

The simple way is to acquire the lock before bpf_prog_run_xdp() is
invoked and hold it until the end of the function.
This does not always work because some drivers (cpsw, atlantic) invoke
xdp_do_flush() in the same context.
Acquiring the lock in bpf_prog_run_xdp() and dropping it in
xdp_do_redirect() (without touching drivers) does not work because not
all drivers that use bpf_prog_run_xdp() support XDP_REDIRECT (and
invoke xdp_do_redirect()).

Ideally the minimal locking scope would be bpf_prog_run_xdp() +
xdp_do_redirect(), and everything else (error recovery, DMA unmapping,
free/alloc of memory, …) would happen outside of the locked section.
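
The two locking scopes discussed above (hold until the end of the
function vs. hold only around the program run) can be sketched in
userspace. This is only an analogy, not kernel code: a pthread mutex
stands in for the nested-BH local lock, and every demo_*/DEMO_* name is
hypothetical. It assumes GCC or Clang, since it relies on the
compiler's cleanup variable attribute to release the lock when the
guarded scope is left, including via return or goto.

```c
/* Userspace analogy only; all demo_ and DEMO_ names are hypothetical. */
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t redirect_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_depth;		/* 1 while the lock is held, else 0 */
static int held_after_scope;	/* lock_depth sampled after the narrow scope */

static pthread_mutex_t *demo_lock(pthread_mutex_t *m)
{
	pthread_mutex_lock(m);
	lock_depth++;
	return m;
}

/* Cleanup handler: receives a pointer to the guard variable. */
static void demo_unlock(pthread_mutex_t **m)
{
	lock_depth--;
	pthread_mutex_unlock(*m);
}

/* Take the lock now, drop it when the enclosing scope is left. */
#define DEMO_GUARD(m)						\
	pthread_mutex_t *demo_guard_				\
	__attribute__((cleanup(demo_unlock), unused)) = demo_lock(m)

enum demo_act { DEMO_PASS, DEMO_REDIRECT };

/* Stand-in for running the program; the caller must hold the lock. */
static enum demo_act demo_run_prog(int pkt)
{
	assert(lock_depth == 1);
	return (pkt & 1) ? DEMO_REDIRECT : DEMO_PASS;
}

/* Simple variant: the lock is held until the function returns. */
int demo_rx_function_scope(int pkt)
{
	DEMO_GUARD(&redirect_lock);

	if (demo_run_prog(pkt) == DEMO_REDIRECT)
		return 1;	/* lock dropped on this return */
	return 0;		/* and on this one */
}

/* Narrow variant: only the program run sits inside the locked block. */
int demo_rx_narrow_scope(int pkt)
{
	int redirected = 0;

	{
		DEMO_GUARD(&redirect_lock);

		if (demo_run_prog(pkt) == DEMO_REDIRECT) {
			redirected = 1;
			goto done;	/* leaving the block drops the lock */
		}
	}
	/* PASS path: further packet handling runs unlocked here. */
done:
	held_after_scope = lock_depth;
	return redirected;
}
```

Building this needs -pthread. In the narrow variant the lock is
already released when the done label is reached on either path.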

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Dimitris Michailidis <dmichail@fungible.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: Jeroen de Borst <redacted>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: John Crispin <john@phrozen.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: Mark Lee <redacted>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Praveen Kaligineedi <redacted>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Shailend Chand <redacted>
Cc: UNGLinuxDriver@microchip.com
Cc: Wei Liu <wei.liu@kernel.org>
Cc: bpf@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: linux-mediatek@lists.infradead.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/ethernet/fungible/funeth/funeth_rx.c     |  1 +
 drivers/net/ethernet/google/gve/gve_rx.c             | 12 +++++++-----
 drivers/net/ethernet/mediatek/mtk_eth_soc.c          |  1 +
 drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c |  1 +
 drivers/net/ethernet/microsoft/mana/mana_bpf.c       |  1 +
 5 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/fungible/funeth/funeth_rx.c b/drivers/net/ethernet/fungible/funeth/funeth_rx.c
index 7e2584895de39..e7b1382545908 100644
--- a/drivers/net/ethernet/fungible/funeth/funeth_rx.c
+++ b/drivers/net/ethernet/fungible/funeth/funeth_rx.c
@@ -152,6 +152,7 @@ static void *fun_run_xdp(struct funeth_rxq *q, skb_frag_t *frags, void *buf_va,
 	xdp_prepare_buff(&xdp, buf_va, FUN_XDP_HEADROOM, skb_frag_size(frags) -
 			 (FUN_RX_TAILROOM + FUN_XDP_HEADROOM), false);
 
+	guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
 	xdp_prog = READ_ONCE(q->xdp_prog);
 	act = bpf_prog_run_xdp(xdp_prog, &xdp);
 
diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
index 73655347902d2..504c8ef761a33 100644
--- a/drivers/net/ethernet/google/gve/gve_rx.c
+++ b/drivers/net/ethernet/google/gve/gve_rx.c
@@ -779,11 +779,13 @@ static void gve_rx(struct gve_rx_ring *rx, netdev_features_t feat,
 				 page_info->page_offset, GVE_RX_PAD,
 				 len, false);
 		old_data = xdp.data;
-		xdp_act = bpf_prog_run_xdp(xprog, &xdp);
-		if (xdp_act != XDP_PASS) {
-			gve_xdp_done(priv, rx, &xdp, xprog, xdp_act);
-			ctx->total_size += frag_size;
-			goto finish_ok_pkt;
+		scoped_guard(local_lock_nested_bh, &bpf_run_lock.redirect_lock) {
+			xdp_act = bpf_prog_run_xdp(xprog, &xdp);
+			if (xdp_act != XDP_PASS) {
+				gve_xdp_done(priv, rx, &xdp, xprog, xdp_act);
+				ctx->total_size += frag_size;
+				goto finish_ok_pkt;
+			}
 		}
 
 		page_info->pad += xdp.data - old_data;
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 3cf6589cfdacf..477a74ee18c0a 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1946,6 +1946,7 @@ static u32 mtk_xdp_run(struct mtk_eth *eth, struct mtk_rx_ring *ring,
 	if (!prog)
 		goto out;
 
+	guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
 	act = bpf_prog_run_xdp(prog, xdp);
 	switch (act) {
 	case XDP_PASS:
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c b/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c
index 9ee61db8690b4..026311af07f9e 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c
@@ -84,6 +84,7 @@ int lan966x_xdp_run(struct lan966x_port *port, struct page *page, u32 data_len)
 	xdp_prepare_buff(&xdp, page_address(page),
 			 IFH_LEN_BYTES + XDP_PACKET_HEADROOM,
 			 data_len - IFH_LEN_BYTES, false);
+	guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
 	act = bpf_prog_run_xdp(xdp_prog, &xdp);
 	switch (act) {
 	case XDP_PASS:
diff --git a/drivers/net/ethernet/microsoft/mana/mana_bpf.c b/drivers/net/ethernet/microsoft/mana/mana_bpf.c
index 23b1521c0df96..d465b1dd9fca0 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_bpf.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_bpf.c
@@ -93,6 +93,7 @@ u32 mana_run_xdp(struct net_device *ndev, struct mana_rxq *rxq,
 	xdp_init_buff(xdp, PAGE_SIZE, &rxq->xdp_rxq);
 	xdp_prepare_buff(xdp, buf_va, XDP_PACKET_HEADROOM, pkt_len, false);
 
+	guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
 	act = bpf_prog_run_xdp(prog, xdp);
 
 	rx_stats = &rxq->stats;
-- 
2.43.0