Thread (77 messages) 77 messages, 8 authors, 2020-09-03

Re: [PATCH 05/28] media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT

From: Tomasz Figa <tfiga@chromium.org>
Date: 2020-08-19 14:22:57
Also in: alsa-devel, linux-doc, linux-iommu, linux-media, linux-mips, linux-mm, linux-nvme, linux-samsung-soc, linux-scsi, lkml, netdev, nouveau

On Wed, Aug 19, 2020 at 4:07 PM Robin Murphy [off-list ref] wrote:
On 2020-08-19 13:49, Tomasz Figa wrote:
quoted
On Wed, Aug 19, 2020 at 1:51 PM Robin Murphy [off-list ref] wrote:
quoted
Hi Tomasz,

On 2020-08-19 12:16, Tomasz Figa wrote:
quoted
Hi Christoph,

On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig [off-list ref] wrote:
quoted
The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
Could you explain what makes you think it's unused? It's a feature of
the UAPI generally supported by the videobuf2 framework and relied on
by Chromium OS to get any kind of reasonable performance when
accessing V4L2 buffers in the userspace.
quoted
and causes
weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is
unimplemented except on PARISC and some MIPS configs, and about to be
removed.
It is implemented by the generic DMA mapping layer [1], which is used
by a number of architectures including ARM64 and supposed to be used
by new architectures going forward.
AFAICS all that V4L2_FLAG_MEMORY_NON_CONSISTENT does is end up
controling whether DMA_ATTR_NON_CONSISTENT is added to vb2_queue::dma_attrs.

Please can you point to where DMA_ATTR_NON_CONSISTENT does anything at
all on arm64?
With the default config it doesn't, but with
CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep
the pgprot value as is, without enforcing coherence attributes.
How active are the PA-RISC and MIPS ports of Chromium OS?
Not active. We enable CONFIG_DMA_NONCOHERENT_CACHE_SYNC for ARM64,
given the directions received back in April when discussing the
noncoherent memory functionality on the mailing list in the thread I
pointed out in my previous message and no clarification on why it is
disabled for ARM64 in upstream, despite making several attempts to get
some.
Hacking CONFIG_DMA_NONCOHERENT_CACHE_SYNC into an architecture that
doesn't provide dma_cache_sync() is wrong, since at worst it may break
other drivers. If downstream is wildly misusing an API then so be it,
but it's hardly a strong basis for an upstream argument.
I guess it means that we're wildly misusing the API, but it still does
work. Could you explain how it could break other drivers?
quoted
quoted
Also, I posit that videobuf2 is not actually relying on
DMA_ATTR_NON_CONSISTENT anyway, since it's clearly not using it properly:

"By using this API, you are guaranteeing to the platform
that you have all the correct and necessary sync points for this memory
in the driver should it choose to return non-consistent memory."

$ git grep dma_cache_sync drivers/media
$
AFAIK dma_cache_sync() isn't the only way to perform the cache
synchronization. The earlier patch series that I reviewed relied on
dma_get_sgtable() and then dma_sync_sg_*() (which existed in the
vb2-dc since forever [1]). However, it looks like with the final code
the sgtable isn't acquired and the synchronization isn't happening, so
you have a point.
Using the streaming sync calls on coherent allocations has also always
been wrong per the API, regardless of the bodies of code that have
happened to get away with it for so long.
quoted
FWIW, I asked back in time what the plan is for non-coherent
allocations and it seemed like DMA_ATTR_NON_CONSISTENT and
dma_sync_*() was supposed to be the right thing to go with. [2] The
same thread also explains why dma_alloc_pages() isn't suitable for the
users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.
AFAICS even back then Christoph was implying getting rid of
NON_CONSISTENT and *replacing* it with something streaming-API-based -
That's not how I read his reply from the thread I pointed to, but that
might of course be my misunderstanding.
i.e. this series - not encouraging mixing the existing APIs. It doesn't
seem impossible to implement a remapping version of this new
dma_alloc_pages() for IOMMU-backed ops if it's really warranted
(although at that point it seems like "non-coherent" vb2-dc starts to
have significant conceptual overlap with vb2-sg).
No, there is no overlap between vb2-dc and vb2-sg. They differ on
another level - the former is to be used by devices without
scatter-gather or internal mapping capabilities and gives the driver a
single DMA address for the whole buffer, regardless of whether it's
IOVA-contiguous (for devices behind an IOMMU) or physically contiguous
(for the others), while the latter gives the driver an sgtable, which
of course may be DMA-contiguous internally, but doesn't have to and
usually isn't. This model makes it possible to hide the SoC
implementation details from particular drivers, since those are very
often reused on many SoCs which differ in the availability of IOMMU,
DMA addressing restrictions and so on.

Best regards,
Tomasz

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help