Thread (14 messages) 14 messages, 7 authors, 2019-07-22

Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

From: Michael Ellerman <mpe@ellerman.id.au>
Date: 2019-07-22 23:12:05
Also in: linux-iommu

Shawn Anastasio [off-list ref] writes:
On 7/22/19 7:16 AM, Michael Ellerman wrote:
quoted
Arnd Bergmann [off-list ref] writes:
quoted
On Thu, Jul 18, 2019 at 11:52 AM Christoph Hellwig [off-list ref] wrote:
quoted
On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:
quoted
On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:
quoted
quoted
Other than m68k, mips, and arm64, everybody else that doesn't have
ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
I assume this behavior is acceptable on those architectures.
It might be acceptable, but there's no reason to use pgport_noncached
if the platform supports cache-coherent DMA.

Christoph (+cc) made the change so maybe he saw something we're missing.
I always found the forcing of noncached access even for coherent
devices a little odd, but this was inherited from the previous
implementation, which surprised me a bit as the different attributes
are usually problematic even on x86.  Let me dig into the history a
bit more, but I suspect the righ fix is to default to cached mappings
for coherent devices.
Ok, some history:

The generic dma mmap implementation, which we are effectively still
using today was added by:

commit 64ccc9c033c6089b2d426dad3c56477ab066c999
Author: Marek Szyprowski [off-list ref]
Date:   Thu Jun 14 13:03:04 2012 +0200

     common: dma-mapping: add support for generic dma_mmap_* calls

and unconditionally uses pgprot_noncached in dma_common_mmap, which is
then used as the fallback by dma_mmap_attrs if no ->mmap method is
present.  At that point we already had the powerpc implementation
that only uses pgprot_noncached for non-coherent mappings, and
the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
is set and otherwise pgprot_dmacoherent, which seems to be uncached.
Arm did support coherent platforms at that time, but they might have
been an afterthought and not handled properly.
Cache-coherent devices are still very rare on 32-bit ARM.

Among the callers of dma_mmap_coherent(), almost all are in platform
specific device drivers that only ever run on noncoherent ARM SoCs,
which explains why nobody would have noticed problems.

There is also a difference in behavior between ARM and PowerPC
when dealing with mismatched cacheability attributes: If the same
page is mapped as both cached and uncached to, this may
cause silent undefined behavior on ARM, while PowerPC should
enter a checkstop as soon as it notices.
On newer Power CPUs it's actually more like the ARM behaviour.

I don't know for sure that it will *never* checkstop but there are at
least cases where it won't. There's some (not much) detail in the
Power8/9 user manuals.
The issue was discovered due to sporadic checkstops on P9, so it
seems like it will happen at least sometimes.
Yeah true. I wasn't sure if that checkstop was actually caused by a
cached/uncached mismatch or something else, but looks like it was, from
the hostboot issue (https://github.com/open-power/hostboot/issues/180):

  12.47015|  Signature Description    : pu.ex:k0:n0:s0:p00:c0 (L2FIR[16]) Cache line inhibited hit cacheable space


So I'm not really sure how to square that with the documentation in the
user manual:

  If a caching-inhibited load instruction hits in the L1 data cache, the
  load data is serviced from the L1 data cache and no request is sent to
  the NCU.
  If a caching-inhibited store instruction hits in the L1 data cache,
  the store data is written to the L1 data cache and sent to the NCU.
  Note that the L1 data cache and L2 cache are no longer coherent.

I guess I'm either misinterpreting that section or there's some *other*
documentation somewhere I haven't found that says that it will also
checkstop.

cheers
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help