Re: [PATCH net v3 2/2] net: mana: Skip redundant detach on already-detached port
From: Paolo Abeni <pabeni@redhat.com>
Date: 2026-05-28 09:30:47
Also in: bpf, linux-hyperv, linux-rdma, lkml
On 5/25/26 10:08 AM, Dipayaan Roy wrote:
> When mana_per_port_queue_reset_work_handler() runs after a previous
> detach succeeded but attach failed, the port is left in a detached
> state with apc->tx_qp and apc->rxqs already freed. Calling
> mana_detach() again unconditionally leads to NULL pointer
> dereferences during queue teardown.
>
> Add an early exit in mana_detach() when the port is already in
> detached state (!netif_device_present) for non-close callers, making
> it safe to call idempotently. This allows the queue reset handler and
> other recovery paths to simply retry mana_attach() without redundant
> teardown.
>
> Fixes: 3b194343c250 ("net: mana: Implement ndo_tx_timeout and serialize queue resets per port.")
> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
> Signed-off-by: Dipayaan Roy <redacted>
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 0582803907a8..1e1ad2795c3c 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3350,6 +3350,12 @@ int mana_detach(struct net_device *ndev, bool from_close)
>  	ASSERT_RTNL();
>
> +	/* If already detached (indicates detach succeeded but attach failed
> +	 * previously). Now skip mana detach and just retry mana_attach.
> +	 */
> +	if (!from_close && !netif_device_present(ndev))
> +		return 0;
> +
>  	apc->port_st_save = apc->port_is_up;
>  	apc->port_is_up = false;
sashiko (gemini) notes that the above can lead to a different race:

---
Can this early return cause state machine corruption by bypassing the
updates to apc->port_st_save?

Consider this sequence:

1. queue_reset_work runs, mana_detach() succeeds (apc->port_st_save = true,
   apc->port_is_up = false), but mana_attach() fails.
2. The admin brings the interface down (ip link set dev eth0 down),
   skipping mana_close() since apc->port_is_up is false.
3. The admin changes the MTU, triggering mana_change_mtu(), which calls
   mana_detach() followed by mana_attach().
4. mana_detach() hits this new early return, preserving
   apc->port_st_save == true.

When mana_attach() runs, it sees apc->port_st_save == true and allocates
queues, setting apc->vport_use_count = 1 and apc->port_is_up = true, even
though the interface is administratively down.

If the admin then brings the interface up, mana_open() will unconditionally
call mana_alloc_queues(). That function calls mana_cfg_vport(), which will
return -EBUSY because apc->vport_use_count is already 1. This leaves
mana_open() failing and the interface down.

Since the interface is already down, trying to bring it down again is a
no-op, meaning mana_close() is never called to clean up the orphaned
queues.

Does this sequence permanently brick the port until the driver is reloaded?
---

I think you need to be more restrictive in the early return check.

/P
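
For anyone who wants to step through the reported sequence outside the kernel, here is a rough userspace sketch. It is a toy model and not driver code: struct model_port and every model_* helper are hypothetical stand-ins that mirror only the fields and transitions named in the report above. The injected -ENOMEM in model_attach() is an arbitrary placeholder for "attach failed". Whether the real driver behaves exactly this way still needs to be confirmed against mana_en.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the per-port state named in the report; hypothetical. */
struct model_port {
	bool admin_up;        /* interface administratively up (IFF_UP) */
	bool present;         /* what netif_device_present() would report */
	bool port_is_up;      /* models apc->port_is_up */
	bool port_st_save;    /* models apc->port_st_save */
	int vport_use_count;  /* models apc->vport_use_count */
};

/* Models the -EBUSY described for mana_cfg_vport() when the vport is in use. */
static int model_alloc_queues(struct model_port *p)
{
	if (p->vport_use_count > 0)
		return -EBUSY;
	p->vport_use_count = 1;
	return 0;
}

/* Models mana_detach() with the early return proposed in the patch. */
static int model_detach(struct model_port *p, bool from_close)
{
	if (!from_close && !p->present)
		return 0;               /* port_st_save is left untouched */

	p->port_st_save = p->port_is_up;
	p->port_is_up = false;
	if (p->port_st_save)
		p->vport_use_count = 0; /* queues torn down */
	if (!from_close)
		p->present = false;
	return 0;
}

/* Models mana_attach(); inject_failure is an assumption used to fail step 1. */
static int model_attach(struct model_port *p, bool inject_failure)
{
	int err;

	if (inject_failure)
		return -ENOMEM;
	if (p->port_st_save) {
		err = model_alloc_queues(p);
		if (err)
			return err;
	}
	p->port_is_up = p->port_st_save;
	p->present = true;
	return 0;
}

/* "ip link set down": close only does work when port_is_up is set. */
static void model_link_down(struct model_port *p)
{
	if (!p->admin_up)
		return;                 /* already down: no-op */
	p->admin_up = false;
	if (p->port_is_up)
		model_detach(p, true);
}

/* "ip link set up": open allocates queues unconditionally. */
static int model_link_up(struct model_port *p)
{
	int err;

	if (p->admin_up)
		return 0;
	err = model_alloc_queues(p);
	if (err)
		return err;
	p->port_is_up = true;
	p->admin_up = true;
	return 0;
}

/* Replays steps 1-4 plus the final up/down attempt; returns the open result. */
int replay_reported_sequence(struct model_port *p)
{
	int err;

	*p = (struct model_port){ .admin_up = true, .present = true,
				  .port_is_up = true, .vport_use_count = 1 };

	model_detach(p, false);         /* 1. reset work: detach succeeds... */
	err = model_attach(p, true);    /*    ...but attach fails */
	assert(err != 0);
	model_link_down(p);             /* 2. admin down, close is skipped */
	model_detach(p, false);         /* 3./4. MTU change hits the early return */
	err = model_attach(p, false);   /*    attach re-allocates queues */
	assert(err == 0 && p->port_is_up && !p->admin_up);

	err = model_link_up(p);         /* admin up: open fails */
	model_link_down(p);             /* already down, nothing is cleaned up */
	return err;
}
```

In this model replay_reported_sequence() returns -EBUSY and leaves vport_use_count at 1 with the interface administratively down, which matches the end state described in the report. The model says nothing about what the more restrictive check should look like.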