Thread (36 messages) 36 messages, 3 authors, 2014-12-15
STALE4196d

[PATCH 2/4] iommu: add ARM LPAE page table allocator

From: Will Deacon <hidden>
Date: 2014-12-05 18:48:38
Also in: linux-iommu

On Tue, Dec 02, 2014 at 11:47:36AM +0000, Laurent Pinchart wrote:
On Tuesday 02 December 2014 09:41:56 Will Deacon wrote:
quoted
On Mon, Dec 01, 2014 at 08:21:58PM +0000, Laurent Pinchart wrote:
quoted
On Monday 01 December 2014 17:23:15 Will Deacon wrote:
quoted
On Sun, Nov 30, 2014 at 11:29:46PM +0000, Laurent Pinchart wrote:
quoted
On Thursday 27 November 2014 11:51:16 Will Deacon wrote:
quoted
+     /* Looking good; allocate a pgd */
+     data->pgd = alloc_pages_exact(1UL << data->pg_shift,
+                                   GFP_KERNEL | __GFP_ZERO);
data->pg_shift is computed as __ffs(cfg->pgsize_bitmap). 1UL <<
data->pg_shift will thus be equal to the smallest page size supported
by the IOMMU. This will thus allocate 4kB, 16kB or 64kB depending on
the IOMMU configuration. However, if I'm not mistaken the top-level
directory needs to store one entry per largest supported page size.
That's 4, 128 or 8 entries depending on the configuration. You're thus
over-allocating.
Yeah, I'll take a closer look at this. The size of the pgd really
depends on the TxSZ configuration, which in turn depends on the ias and
the page size. There are also alignment constraints to bear in mind, but
I'll see what I can do (as it stands, over-allocating will always work).
Beside wasting memory, the code also doesn't reflect the requirements. It
works by chance, meaning it could break later.
It won't break, as the maximum size *is* bounded by a page for stage-1
and we already handle stage-2 concatenation correctly.
What I mean is that there's no correlation between the required size and the 
allocated size in the current code. It happens to work, but if the driver gets 
extended later to support more IOMMU configurations subtle bugs may crop up.
quoted
quoted
That's why I'd like to see this
being fixed. Can't the size be computed with something like

	size = (1 << (ias - data->levels * data->pg_shift))
	
	     * sizeof(arm_lpae_iopte);

(please add a proper detailed comment to explain the computation, as the
meaning is not straightforward)
That's actually the easy part. The harder part is getting the correct
alignment, which means managing by own kmem_cache on a per-ops basis. That
feels like overkill to me and we also need to make sure that we don't screw
up the case of concatenated pgds at stage-2. On top of that, since each
cache would be per-ops, I'm not even sure we'd save anything (the slab
allocators all operate on pages afaict).

If I use alloc_page_exact, we'll still have some wasteage, but it would
be less for the case where the CPU page size is smaller than the SMMU page
size. Do you think that's worth the extra complexity? We allocate full pages
at all levels after the pgd, so the wasteage is relatively small.

An alternative would be preinitialising some caches for `likely' pgd sizes,
but that's also horrible, especially if the kernel decides that it doesn't
need a bunch of the configurations at runtime.
How about just computing the right size, align it to a page size, and using 
alloc_page_exact ? The waste is small, so it doesn't justify anything more 
complex than that.
Ok, I'll have a go at that.

Will
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help