Thread (13 messages) 13 messages, 4 authors, 1d ago

Re: [PATCH v8 1/3] PCI: mediatek: Use actual physical address instead of virt_to_phys()

From: Manivannan Sadhasivam <mani@kernel.org>
Date: 2026-05-23 14:43:50
Also in: linux-mediatek, linux-mips, linux-pci, lkml

On Fri, May 22, 2026 at 05:43:25PM -0500, Bjorn Helgaas wrote:
On Thu, May 21, 2026 at 10:44:51AM +0530, Manivannan Sadhasivam wrote:
quoted
On Wed, May 20, 2026 at 02:59:00PM -0500, Bjorn Helgaas wrote:
quoted
On Wed, May 20, 2026 at 09:17:35PM +0200, Caleb James DeLisle wrote:
quoted
On 20/05/2026 20:55, Bjorn Helgaas wrote:
quoted
On Wed, May 20, 2026 at 06:38:25PM +0000, Caleb James DeLisle wrote:
quoted
From: Manivannan Sadhasivam <redacted>

The driver previously used virt_to_phys() on the ioremapped register base
(port->base) to compute the MSI message address. Using virt_to_phys() on an
IO mapped address is incorrect because it expects a kernel virtual address.

To fix it, store the physical start of the I/O register region in
mtk_pcie_port->phys_base and use it to build the MSI address. This replaces
the incorrect virt_to_phys() usage and ensures MSI addresses are generated
correctly.

Fixes: 43e6409db64d ("PCI: mediatek: Add MSI support for MT2712 and MT7622")
Signed-off-by: Manivannan Sadhasivam <redacted>
Tested-by: Caleb James DeLisle <cjd@cjdns.fr>
---
  drivers/pci/controller/pcie-mediatek.c | 16 +++++++++++++---
  1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/controller/pcie-mediatek.c b/drivers/pci/controller/pcie-mediatek.c
index 75722524fe74..c503fbd774d0 100644
--- a/drivers/pci/controller/pcie-mediatek.c
+++ b/drivers/pci/controller/pcie-mediatek.c
@@ -175,6 +175,7 @@ struct mtk_pcie_soc {
  /**
   * struct mtk_pcie_port - PCIe port information
   * @base: IO mapped register base
+ * @phys_base: Physical address of the I/O register base region
   * @list: port list
   * @pcie: pointer to PCIe host info
   * @reset: pointer to port reset control
@@ -196,6 +197,7 @@ struct mtk_pcie_soc {
   */
  struct mtk_pcie_port {
  	void __iomem *base;
+	phys_addr_t phys_base;
  	struct list_head list;
  	struct mtk_pcie *pcie;
  	struct reset_control *reset;
@@ -405,7 +407,7 @@ static void mtk_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
  	phys_addr_t addr;
  	/* MT2712/MT7622 only support 32-bit MSI addresses */
-	addr = virt_to_phys(port->base + PCIE_MSI_VECTOR);
+	addr = port->phys_base + PCIE_MSI_VECTOR;
This doesn't look right because the MSI address is a PCI bus address,
and port->phys_base is a CPU physical address.  Often a PCI bus
address is the same as the CPU physical address, but not always.
I think the DT 'ranges' property tells you the translation.
Oops, sorry, I muddied the waters here.

'ranges' tells you the translation applied by a bridge, e.g., when
a CPU does a load/store, the PCI host bridge turns it into a PCI
read/write transaction.  The bridge might add an offset to the CPU
load/store physical address to get the PCI read/write bus address.

But that's not the issue here.  The MSI is basically a DMA write
performed by the PCI device, not a store done by a CPU, so I don't
think 'ranges' is the right thing to look at.
Yeah, it is so easy to confuse both. To summarise, 'ranges'
describes the outbound translation and 'dma-ranges' describes the
inbound translation from host perspective.
quoted
Based on this:
https://elinux.org/Device_Tree_Usage#PCI_DMA_Address_Translation I
think 'dma-ranges' is the relevant property.  I don't think your
DT includes a 'dma-ranges' property, and in that case the default
is that the system bus (CPU) address is the same as the PCI
address.

So I think this patch works because it assumes DMA addresses like
the MSI address are mapped to identical system bus addresses.

It still seems to me that drivers should be prepared for the
presence of dma-ranges and use it when computing the MSI target
address.  But I don't think any drivers really do that, so for now
I think you should pretend that I never responded about this
patch.
Your observations are correct. This driver assumes that the
identical mapping exists between CPU and PCI bus addresses. Usually,
the drivers make use of phys_to_dma() to handle the translations.
What does this look like in the native host bridge drivers?  I don't
see any direct calls of phys_to_dma(), but there are some higher-level
interfaces that use it.

I don't really see a consistent style of constructing MSI addresses,
e.g., in *_compose_msi_msg() implementations.
That's because, each driver may use different methods for constructing the MSI
address. Two commonly used methods are:

1. Using a pre-defined MSI address in the MMIO range (like the pcie-mediatek
driver). The PCIe controller hardware decodes MWr TLPs targeting this address
and raises an interrupt to the CPU via the platform interrupt controller (e.g.
GIC).

2. Allocating coherent memory using dma_alloc_coherent() and programming it
into platform MMIO registers (like pcie-designware-host driver). The Root
Complex inbound ATU matches writes to this address and raises an interrupt to
the CPU.
quoted
This API internally makes use of the 'dma_range_map' which gets
populated by the OF core based on the 'dma-ranges' property (if
present in DT).

But it makes sense to use it irrespective of whether the platform
supports non-identical DMA/inbound translation or not. Since this
API behaves like a no-op and returns the CPU physical address if
there is an identical mapping, there is literally zero overhead in
using it.
Thanks for rescuing me.

I wonder if there should be something in
Documentation/core-api/dma-api* about this.  I guess that is mostly
oriented toward things like PCI device drivers, not so much PCI host
bridge drivers.  But it would be nice to have a little intro to
dma-ranges and maybe even the restricted DMA usage.
I think this information should belong to the PCI host controller documentation.
I started writing it, but didn't get time to finish and post it. Maybe I'll
atleast post an initial version and add more info on top.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help