Re: Tegra DRM device tree bindings

From: Lucas Stach <hidden>
Date: 2012-06-28 17:01:29
Also in: dri-devel, linux-iommu, linux-tegra

Hi Thierry,

On Thursday, 28.06.2012 at 13:12 +0200, Thierry Reding wrote:

On Wed, Jun 27, 2012 at 05:59:55PM +0200, Lucas Stach wrote:

On Wednesday, 27.06.2012 at 16:44 +0200, Thierry Reding wrote:

On Wed, Jun 27, 2012 at 05:29:14PM +0300, Hiroshi Doyu wrote:

On Wed, 27 Jun 2012 16:08:10 +0200
Thierry Reding [off-list ref] wrote:

On Wed, Jun 27, 2012 at 03:59:07PM +0300, Hiroshi Doyu wrote:

On Wed, 27 Jun 2012 07:14:18 +0200
Thierry Reding [off-list ref] wrote:

On Tue, Jun 26, 2012 at 08:48:18PM -0600, Stephen Warren wrote:

On 06/26/2012 08:32 PM, Mark Zhang wrote:

On 06/26/2012 07:46 PM, Mark Zhang wrote:

On Tue, 26 Jun 2012 12:55:13 +0200
Thierry Reding [off-list ref] wrote:

...

I'm not sure I understand how information about the carveout would be
obtained from the IOMMU API, though.

I think that can be similar to the current GART implementation. Define the carveout as:

carveout {
        compatible = "nvidia,tegra20-carveout";
        size = <0x10000000>;
};

Then create a file such as "tegra-carveout.c" that reads these definitions and
registers itself as the platform device's IOMMU instance.

The carveout isn't a HW object, so it doesn't seem appropriate to define a DT
node to represent it.

Yes. But I think it's better to export the size of the carveout as a configurable item.
So we need to define this somewhere. How about defining the carveout as a property of the GART?

There already exists a way of preventing Linux from using certain chunks
of memory; the /memreserve/ syntax. From a brief look at the dtc source,
it looks like /memreserve/ entries can have labels, which implies that a
property in the GART node could refer to the /memreserve/ entry by
phandle in order to know what memory regions to use.

Wasn't the whole point of using a carveout supposed to be a replacement
for the GART?

Mostly agree. IIUC, we use both carveout/gart allocated buffers in
android/tegra2.

As such I'd think the carveout should rather be a property
of the host1x device.

Rather than introducing a new property, how about using
"coherent_pool=??M" in the kernel command line if necessary? I think
that this carveout size depends on the system usage/load.

I was hoping that we could get away with using the CMA and perhaps
initialize it based on device tree content. I agree that the carveout
size depends on the use-case, but I still think it makes sense to
specify it on a per-board basis.

DRM driver doesn't know if it uses CMA or not, because DRM only uses
DMA API.

So how is the DRM supposed to allocate buffers? Does it call the
dma_alloc_from_contiguous() function to do that? I can see how it is
used by arm_dma_ops but how does it end up in the driver?

As I said before, the DMA API is not a good fit for graphics drivers.
Most of the DMA buffers used by graphics cores are long lived and big,
so we need a special pool to alloc from to avoid eating all contiguous
address space, as the DMA API does not provide shrinker callbacks for
clients using large amounts of memory.

I recall you mentioning TTM as a better alternative several times in the
past. How does it fit in with this? Does it have the capability of using
a predefined chunk of contiguous memory as a pool to allocate from?

One problem that all of these solutions don't address is that not all
devices below host1x are DRM related. At least for the CSI and VI blocks
I expect there to be V4L2 drivers eventually, so what we really need is
to manage allocations outside of the DRM. host1x is the most logical
choice here.

I think you are right here. We might want to move all of that
buffer/memory management into the host1x code and provide contig memory to
the host1x clients from there.

TTM has the ability to manage a chunk of memory for contig allocations.
Also I think TTM does not depend too heavily on DRM, so we may even be
able to use TTM as the general allocator for host1x clients, including
VI and others. The more advanced stuff in TTM like swapping and moving
buffers might be a bit of overkill for simple stuff like V4L, where you
basically just want something like: "give me a contig buffer and pin it
in address space so it won't ever move", but it should do no harm.
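
To make the "give me a contig buffer and pin it" case concrete, here is a
minimal userspace sketch of a first-fit allocator over a fixed pool of pages.
It is a toy model only: it is not TTM and not kernel code, and the names
contig_pool, pool_alloc, pool_free and POOL_PAGES are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_PAGES 64   /* hypothetical pool size, in pages */

/* One flag per page: 0 = free, 1 = allocated (and never moved). */
struct contig_pool {
    uint8_t used[POOL_PAGES];
};

/*
 * First-fit search for npages consecutive free pages.
 * Returns the index of the first page, or -1 if no run is long enough.
 */
static int pool_alloc(struct contig_pool *p, size_t npages)
{
    size_t run = 0;

    assert(npages > 0);
    for (size_t i = 0; i < POOL_PAGES; i++) {
        run = p->used[i] ? 0 : run + 1;
        if (run == npages) {
            size_t start = i + 1 - npages;

            for (size_t j = start; j <= i; j++)
                p->used[j] = 1;
            return (int)start;
        }
    }
    return -1;
}

/* Give a previously allocated run of pages back to the pool. */
static void pool_free(struct contig_pool *p, int start, size_t npages)
{
    assert(start >= 0 && (size_t)start + npages <= POOL_PAGES);
    for (size_t j = 0; j < npages; j++)
        p->used[(size_t)start + j] = 0;
}
```

Because buffers in this model never move, freeing one can leave the pool
fragmented: a request may fail even though enough pages are free in total.
That is the cost that buffer moving in a more advanced manager avoids.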

Perhaps we can put host1x code somewhere below drivers/gpu (mm
subdirectory?), drivers/memory or perhaps some other or new location
that could eventually host similar drivers for other SoCs.

Then again, maybe it'd be easier for now to put everything below the
drivers/gpu/drm/tegra directory and cross that bridge when we get to it.

I think that "coherent_pool" is only needed when contiguous memory is
scarce on your system. Otherwise it is not even necessary.

Could you explain a bit more why you want the carveout size on a per-board basis?

In the ideal case I would want to not have a carveout size at all.
However there may be situations where you need to make sure some driver
can allocate a given amount of memory. Having to specify this using a
kernel command-line parameter is cumbersome because it may require
changes to the bootloader or whatever. So if you know that a particular
board always needs 128 MiB of carveout, then it makes sense to specify
it on a per-board basis.

If we go with CMA, this is a non-issue, as CMA allows the contig area to be
used for normal allocations and only purges them if it really needs the
space for contig allocs.
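
The behaviour described here can be illustrated with a small userspace model.
This is only a sketch of the idea, not the kernel's CMA implementation: the
names cma_model, model_alloc_movable, model_alloc_contig and AREA_PAGES are
hypothetical, and "migration" is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

#define AREA_PAGES 32   /* hypothetical size of the contiguous area, in pages */

enum page_state { PAGE_FREE, PAGE_MOVABLE, PAGE_CONTIG };

struct cma_model {
    enum page_state page[AREA_PAGES];
    size_t migrated;    /* movable pages pushed out of the area so far */
};

/* A normal allocation borrows one free page. Returns its index or -1. */
static int model_alloc_movable(struct cma_model *m)
{
    for (size_t i = 0; i < AREA_PAGES; i++) {
        if (m->page[i] == PAGE_FREE) {
            m->page[i] = PAGE_MOVABLE;
            return (int)i;
        }
    }
    return -1;
}

/*
 * A contiguous allocation claims the first run of npages that is not held
 * by another contiguous buffer. Movable pages inside the run are evicted
 * (counted as migrated) instead of blocking the request.
 */
static int model_alloc_contig(struct cma_model *m, size_t npages)
{
    size_t run = 0;

    assert(npages > 0);
    for (size_t i = 0; i < AREA_PAGES; i++) {
        run = (m->page[i] == PAGE_CONTIG) ? 0 : run + 1;
        if (run == npages) {
            size_t start = i + 1 - npages;

            for (size_t j = start; j <= i; j++) {
                if (m->page[j] == PAGE_MOVABLE)
                    m->migrated++;
                m->page[j] = PAGE_CONTIG;
            }
            return (int)start;
        }
    }
    return -1;
}
```

In this model the area is never wasted while no contiguous buffer is needed,
which is why a fixed per-board reservation matters less with this approach.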

CMA certainly sounds like the most simple approach. While it may not be
suited for 3D graphics or multimedia processing later on, I think we
could use it as a starting point to get basic framebuffer and X support
up and running. We can always move to something more advanced like TTM
later.

Thierry
