Re: [PATCH net-next v2 5/6] bnxt_en: Optimize recovery path ULP locking in the driver
From: Simon Horman <horms@kernel.org>
Date: 2024-05-02 10:07:24
On Tue, Apr 30, 2024 at 05:30:55PM -0700, Michael Chan wrote:
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> In the error recovery path (AER, firmware recovery, etc), the driver notifies the RoCE driver via ULP_STOP before the reset and via ULP_START after the reset, all under RTNL_LOCK. The RoCE driver can take a long time if there are a lot of QPs to destroy, so it is not ideal to hold the global RTNL lock. Rely on the new en_dev_lock mutex instead for ULP_STOP and ULP_START. For the most part, we move the ULP_STOP call before we take the RTNL lock and move the ULP_START after RTNL unlock. Note that SRIOV re-enablement must be done after ULP_START or RoCE on the VFs will not resume properly after reset. The one scenario in bnxt_hwrm_if_change() where the RTNL lock is already taken in the .ndo_open() context requires the ULP restart to be deferred to the bnxt_sp_task() workqueue. Reviewed-by: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
quoted hunk ↗ jump to hunk
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c index d9ea6fa23923..4cb0fabf977e 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c@@ -437,18 +437,20 @@ static int bnxt_dl_reload_down(struct devlink *dl, bool netns_change, switch (action) { case DEVLINK_RELOAD_ACTION_DRIVER_REINIT: { + bnxt_ulp_stop(bp); rtnl_lock(); if (bnxt_sriov_cfg(bp)) { NL_SET_ERR_MSG_MOD(extack, "reload is unsupported while VFs are allocated or being configured"); rtnl_unlock(); + bnxt_ulp_start(bp, 0); return -EOPNOTSUPP; } if (bp->dev->reg_state == NETREG_UNREGISTERED) { rtnl_unlock(); + bnxt_ulp_start(bp, 0); return -ENODEV;
Hi Selvin, Michael, all, FWIIW, I would have used a goto to unwind this and the previous error. No need to need to respin because of this.
quoted hunk ↗ jump to hunk
} - bnxt_ulp_stop(bp); if (netif_running(bp->dev)) bnxt_close_nic(bp, true, true); bnxt_vf_reps_free(bp);@@ -516,7 +518,6 @@ static int bnxt_dl_reload_up(struct devlink *dl, enum devlink_reload_action acti bnxt_vf_reps_alloc(bp); if (netif_running(bp->dev)) rc = bnxt_open_nic(bp, true, true); - bnxt_ulp_start(bp, rc); if (!rc) { bnxt_reenable_sriov(bp); bnxt_ptp_reapply_pps(bp);@@ -570,6 +571,8 @@ static int bnxt_dl_reload_up(struct devlink *dl, enum devlink_reload_action acti dev_close(bp->dev); } rtnl_unlock(); + if (action == DEVLINK_RELOAD_ACTION_DRIVER_REINIT) + bnxt_ulp_start(bp, rc); return rc; }