Thread (21 messages) 21 messages, 3 authors, 2021-03-15

Re: [PATCH v2 0/5] Add option to mmap GEM buffers cached

From: Thomas Zimmermann <tzimmermann@suse.de>
Date: 2021-03-11 07:53:08
Also in: dri-devel, lkml

Hi

Am 10.03.21 um 20:02 schrieb Paul Cercueil:
Hi Thomas,

Le lun. 8 mars 2021 à 9:41, Thomas Zimmermann [off-list ref] a 
écrit :
quoted
Hi Paul,

having individual functions for each mode only makes sense if the 
decision is at compile time. But in patch 5, you're working around 
your earlier design by introducing in-driver helpers that select the 
correct CMA function.

In SHMEM helpers we have the flag map_wc in the GEM structure that 
selects the pages caching mode (wc vs uncached). I think CMA should 
Re-reading this, it should rather be WC and cached.
quoted
use this design as well. Have a map_noncoherent flag in the CMA GEM 
object and set it from the driver's implementation of gem_create_object.
Is that a new addition? That severely reduces the patchset size, which 
is perfect.
It was added a few months ago, around the time you send the first 
version of the patches at hand.

Originally, SHMEM uses write combining by default. And several drivers 
used default mapping flags instead (so usually cached). IIRC I 
streamlined everything and flipped the default. Most drivers can use 
cached mappings and only some require WC.

Recently there was also a patchset that added support for cached video 
RAM (or something like that). So at some point we could store these 
flags in drm_gem_object. For now, I'd just put a flag into 
drm_gem_cma_object.
I'll send a V3 then.
Great!

Best regards
Thomas
Cheers,
-Paul
quoted
And in the long run, we could try to consolidate all drivers/helpers 
mapping flags in struct drm_gem_object.

Best regards
Thomas

Am 07.03.21 um 21:28 schrieb Paul Cercueil:
quoted
Rework of my previous patchset which added support for GEM buffers
backed by non-coherent memory to the ingenic-drm driver.

Having GEM buffers backed by non-coherent memory is interesting in
the particular case where it is faster to render to a non-coherent
buffer then sync the data cache, than to render to a write-combine
buffer, and (by extension) much faster than using a shadow buffer.
This is true for instance on some Ingenic SoCs, where even simple
blits (e.g. memcpy) are about three times faster using this method.

For the record, the previous patchset was accepted for 5.10 then had
to be reverted, as it conflicted with some changes made to the DMA API.

This new patchset is pretty different as it adds the functionality to
the DRM core. The first three patches add variants to existing functions
but with the "non-coherent memory" twist, exported as GPL symbols. The
fourth patch adds a function to be used with the damage helpers.
Finally, the last patch adds support for non-coherent GEM buffers to the
ingenic-drm driver. The functionality is enabled through a module
parameter, and is disabled by default.

Cheers,
-Paul

Paul Cercueil (5):
   drm: Add and export function drm_gem_cma_create_noncoherent
   drm: Add and export function drm_gem_cma_dumb_create_noncoherent
   drm: Add and export function drm_gem_cma_mmap_noncoherent
   drm: Add and export function drm_gem_cma_sync_data
   drm/ingenic: Add option to alloc cached GEM buffers

  drivers/gpu/drm/drm_gem_cma_helper.c      | 223 +++++++++++++++++++---
  drivers/gpu/drm/ingenic/ingenic-drm-drv.c |  49 ++++-
  drivers/gpu/drm/ingenic/ingenic-drm.h     |   4 +
  drivers/gpu/drm/ingenic/ingenic-ipu.c     |  14 +-
  include/drm/drm_gem_cma_helper.h          |  13 ++
  5 files changed, 273 insertions(+), 30 deletions(-)
-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help