Re: [PATCH net] net/mlx5: fix calling mlx5_cmd_init() before DMA mask is set
From: Niklas Schnelle <schnelle@linux.ibm.com>
Date: 2023-09-29 09:41:15
Also in:
linux-rdma, linux-s390, lkml
On Thu, 2023-09-28 at 20:59 +0300, Leon Romanovsky wrote:
quoted hunk ↗ jump to hunk
On Thu, Sep 28, 2023 at 03:55:47PM +0200, Niklas Schnelle wrote:quoted
Since commit 06cd555f73ca ("net/mlx5: split mlx5_cmd_init() to probe and reload routines") mlx5_cmd_init() is called in mlx5_mdev_init() which is called in probe_one() before mlx5_pci_init(). This is a problem because mlx5_pci_init() is where the DMA and coherent mask is set but mlx5_cmd_init() already does a dma_alloc_coherent(). Thus a DMA allocation is done during probe before the correct mask is set. This causes probe to fail initialization of the cmdif SW structs on s390x after that is converted to the common dma-iommu code. This is because on s390x DMA addresses below 4 GiB are reserved on current machines and unlike the old s390x specific DMA API implementation common code enforces DMA masks. Fix this by switching the order of the mlx5_mdev_init() and mlx5_pci_init() in probe_one(). Link: https://lore.kernel.org/linux-iommu/cfc9e9128ed5571d2e36421e347301057662a09e.camel@linux.ibm.com/ (local) Fixes: 06cd555f73ca ("net/mlx5: split mlx5_cmd_init() to probe and reload routines") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> --- Note: I ran into this while testing the linked series for converting s390x to use dma-iommu. The existing s390x specific DMA API implementation doesn't respect DMA masks and is thus not affected despite of course also only supporting DMA addresses above 4 GiB. That said ConnectX VFs are the primary users of native PCI on s390x and we'd really like to get the DMA API conversion into v6.7 so this has high priority for us. --- drivers/net/ethernet/mellanox/mlx5/core/main.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 15561965d2af..06744dedd928 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c@@ -1908,10 +1908,6 @@ static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id) goto adev_init_err; } - err = mlx5_mdev_init(dev, prof_sel); - if (err) - goto mdev_init_err; - err = mlx5_pci_init(dev, pdev, id); if (err) { mlx5_core_err(dev, "mlx5_pci_init failed with error code %d\n",@@ -1919,6 +1915,10 @@ static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id) goto pci_init_err; } + err = mlx5_mdev_init(dev, prof_sel); + if (err) + goto mdev_init_err; +I had something different in mind as I'm worry that call to pci_enable_device() in mlx5_pci_init() before we finished FW command interface initialization is a bit premature. What about the following patch?diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 15561965d2af..31f1d701116a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c@@ -905,12 +905,6 @@ static int mlx5_pci_init(struct mlx5_core_dev *dev, struct pci_dev *pdev, pci_set_master(pdev); - err = set_dma_caps(pdev); - if (err) { - mlx5_core_err(dev, "Failed setting DMA capabilities mask, aborting\n"); - goto err_clr_master; - } - if (pci_enable_atomic_ops_to_root(pdev, PCI_EXP_DEVCAP2_ATOMIC_COMP32) && pci_enable_atomic_ops_to_root(pdev, PCI_EXP_DEVCAP2_ATOMIC_COMP64) && pci_enable_atomic_ops_to_root(pdev, PCI_EXP_DEVCAP2_ATOMIC_COMP128))@@ -1908,9 +1902,15 @@ static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id) goto adev_init_err; } + err = set_dma_caps(pdev); + if (err) { + mlx5_core_err(dev, "Failed setting DMA capabilities mask, aborting\n"); + goto dma_cap_err; + } + err = mlx5_mdev_init(dev, prof_sel); if (err) - goto mdev_init_err; + goto dma_cap_err; err = mlx5_pci_init(dev, pdev, id); if (err) {@@ -1942,7 +1942,7 @@ static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id) mlx5_pci_close(dev); pci_init_err: mlx5_mdev_uninit(dev); -mdev_init_err: +dma_cap_err: mlx5_adev_idx_free(dev->priv.adev_idx); adev_init_err: mlx5_devlink_free(devlink);Thanks
The above works too. Maybe for consistency within probe_one() it would then make sense to also rename set_dma_caps() to mlx5_dma_init()? Thanks, Niklas