Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift
From: Nishanth Aravamudan <hidden>
Date: 2015-10-28 01:55:04
Also in:
linux-nvme, lkml, sparclinux
On 28.10.2015 [12:00:20 +1100], Alexey Kardashevskiy wrote:
> On 10/28/2015 09:27 AM, Nishanth Aravamudan wrote:
> > On 27.10.2015 [17:02:16 +1100], Alexey Kardashevskiy wrote:
> > > On 10/24/2015 07:57 AM, Nishanth Aravamudan wrote:
> > > > On Power, the kernel's page size can differ from the IOMMU's page
> > > > size, so we need to override the generic implementation, which
> > > > always returns the kernel's page size. Look up the IOMMU's page
> > > > size from struct iommu_table, if available. Fall back to the
> > > > kernel's page size, otherwise.
> > > >
> > > > Signed-off-by: Nishanth Aravamudan <redacted>
> > > > ---
> > > >  arch/powerpc/include/asm/dma-mapping.h | 3 +++
> > > >  arch/powerpc/kernel/dma.c              | 9 +++++++++
> > > >  2 files changed, 12 insertions(+)
> > > >
> > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> > > > index 7f522c0..c5638f4 100644
> > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > @@ -125,6 +125,9 @@ static inline void set_dma_offset(struct device *dev, dma_addr_t off)
> > > >  #define HAVE_ARCH_DMA_SET_MASK 1
> > > >  extern int dma_set_mask(struct device *dev, u64 dma_mask);
> > > >
> > > > +#define HAVE_ARCH_DMA_GET_PAGE_SHIFT 1
> > > > +extern unsigned long dma_get_page_shift(struct device *dev);
> > > > +
> > > >  #include <asm-generic/dma-mapping-common.h>
> > > >
> > > >  extern int __dma_set_mask(struct device *dev, u64 dma_mask);
> > > > diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
> > > > index 59503ed..e805af2 100644
> > > > --- a/arch/powerpc/kernel/dma.c
> > > > +++ b/arch/powerpc/kernel/dma.c
> > > > @@ -335,6 +335,15 @@ int dma_set_mask(struct device *dev, u64 dma_mask)
> > > >  }
> > > >  EXPORT_SYMBOL(dma_set_mask);
> > > >
> > > > +unsigned long dma_get_page_shift(struct device *dev)
> > > > +{
> > > > +	struct iommu_table *tbl = get_iommu_table_base(dev);
> > > > +	if (tbl)
> > > > +		return tbl->it_page_shift;
> > >
> > > All PCI devices have this initialized on POWER (at least, our, IBM's
> > > POWER) so 4K will always be returned here while in the case of
> > > (get_dma_ops(dev)==&dma_direct_ops) it could actually return
> > > PAGE_SHIFT. Is 4K still the preferred value to return here?
> >
> > Right, so the logic of my series goes like this:
> >
> > a) We currently are assuming DMA_PAGE_SHIFT (conceptual constant) is
> > PAGE_SHIFT everywhere, including Power.
> >
> > b) After 2/7, the Power code will return either the IOMMU table's
> > shift value, if set, or PAGE_SHIFT (I guess this would be the case if
> > get_dma_ops(dev) == &dma_direct_ops, as you said). That is no
> > different than we have now, except we can return the accurate IOMMU
> > value if available.
>
> If it is not available, then something went wrong and
> BUG_ON(!tbl || !tbl->it_page_shift) makes more sense here than
> pretending that this function can ever return PAGE_SHIFT. imho.

That's a good point, thanks!
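
To make the two options concrete, here is a minimal userspace sketch, not kernel code. The struct layouts, the `tbl` field on `struct device`, the `PAGE_SHIFT` value of 16, and the `_fallback`/`_strict` suffixes are all assumptions for illustration; `assert()` merely stands in for `BUG_ON()`.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the real definitions live in the kernel. */
#define PAGE_SHIFT 16 /* assumption: 64K kernel pages */

struct iommu_table {
	unsigned long it_page_shift; /* IOMMU (TCE) page shift */
};

struct device {
	struct iommu_table *tbl; /* hypothetical field backing the lookup below */
};

/* Mock of the kernel helper of the same name. */
static struct iommu_table *get_iommu_table_base(struct device *dev)
{
	return dev->tbl;
}

/* Behaviour described in the commit message: fall back to the kernel page size. */
static unsigned long dma_get_page_shift_fallback(struct device *dev)
{
	struct iommu_table *tbl = get_iommu_table_base(dev);

	if (tbl)
		return tbl->it_page_shift;
	return PAGE_SHIFT;
}

/* Behaviour suggested in review: a missing table or zero shift is a bug. */
static unsigned long dma_get_page_shift_strict(struct device *dev)
{
	struct iommu_table *tbl = get_iommu_table_base(dev);

	/* models BUG_ON(!tbl || !tbl->it_page_shift) */
	assert(tbl && tbl->it_page_shift);
	return tbl->it_page_shift;
}
```

With a table whose it_page_shift is 12, both variants return 12. Without a table, the first silently returns PAGE_SHIFT while the second aborts, which is the difference being discussed.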
> > c) After 3/7, the platform can override the generic Power
> > get_dma_page_shift().
> >
> > d) After 4/7, pseries will return the DDW value, if available, then
> > fall back to the IOMMU table's value. I think in the case of
> > get_dma_ops(dev)==&dma_direct_ops, the only way that can happen is if
> > we are using DDW, right?
>
> This is for pseries guests; for the powernv host it is a "bypass" mode
> which does 64bit direct DMA mapping and there is no additional window
> for that (i.e. DIRECT64_PROPNAME, etc).

You're right! I should update the code to handle both cases.

In "bypass" mode, what TCE size is used? Is it guaranteed to be 4K?

Seems like this would be a different platform implementation I'd put in
for 'powernv', is that right?

My apologies for missing that, and thank you for the review!

-Nish
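
For reference, the lookup order described for patches 3/7 and 4/7 could be sketched roughly as below. This is a userspace model under stated assumptions, not the code from the series: `struct platform_ops`, the `ddw_page_shift` field (0 meaning "no DDW window"), the `tbl` field, and the `PAGE_SHIFT` value are all hypothetical. The powernv "bypass" case is deliberately left out, since its TCE size is the open question above.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 16 /* assumption: 64K kernel pages */

struct iommu_table {
	unsigned long it_page_shift;
};

struct device {
	struct iommu_table *tbl;      /* hypothetical: IOMMU table, may be NULL */
	unsigned long ddw_page_shift; /* hypothetical: 0 means no DDW window */
};

/* Hypothetical per-platform hook, modeling the override allowed after 3/7. */
struct platform_ops {
	unsigned long (*get_dma_page_shift)(struct device *dev);
};

/* Generic Power behaviour: IOMMU table shift if present, else PAGE_SHIFT. */
static unsigned long generic_get_dma_page_shift(struct device *dev)
{
	if (dev->tbl)
		return dev->tbl->it_page_shift;
	return PAGE_SHIFT;
}

/* pseries behaviour after 4/7: prefer the DDW value, then the generic path. */
static unsigned long pseries_get_dma_page_shift(struct device *dev)
{
	if (dev->ddw_page_shift)
		return dev->ddw_page_shift;
	return generic_get_dma_page_shift(dev);
}

/* Dispatch: use the platform hook when one is installed. */
static unsigned long dma_get_page_shift(const struct platform_ops *ops,
					struct device *dev)
{
	if (ops && ops->get_dma_page_shift)
		return ops->get_dma_page_shift(dev);
	return generic_get_dma_page_shift(dev);
}
```

A separate powernv hook would slot into the same `platform_ops` pointer once the bypass-mode page size is settled.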