Thread: 9 messages, 4 authors, 2014-05-02

[PATCH] arm64: mm: Create gigabyte kernel logical mappings where possible

From: Steve Capper <hidden>
Date: 2014-05-01 08:54:12

On Wed, Apr 30, 2014 at 08:11:26PM +0200, Arnd Bergmann wrote:
On Wednesday 30 April 2014 12:36:22 Steve Capper wrote:
We have the capability to map 1GB level 1 blocks when using a 4K
granule.

This patch adjusts the create_mapping logic s.t. when mapping physical
memory on boot, we attempt to use a 1GB block if both the VA and PA
start and end are 1GB aligned. This reduces both the levels of lookup
required to resolve a kernel logical address and the TLB pressure on
cores that support 1GB TLB entries.

Signed-off-by: Steve Capper <redacted>
---
Hello,
This patch has been tested on the FastModel for 4K and 64K pages.
Also, this has been tested with Jungseok's 4 level patch.

I put in the explicit check for PAGE_SHIFT, as I am anticipating a
three level 64KB configuration at some point.

With two level 64K, a PUD is equivalent to a PMD which is equivalent to
a PGD, and these are all level 2 descriptors.

Under three level 64K, a PUD would be equivalent to a PGD, which would
be a level 1 descriptor and thus may not be a block.

Comments/critique/testers welcome.
> It seems like a great idea. I have to admit that I don't understand
> the existing code, but what are the page sizes used here?
Actually, I think it was your idea ;-). I remember you talking about
increasing the mapping size when 4-level page tables were being
discussed. (I think I should have added a Reported-by, would be happy
to if you want?).

With a 64KB granule, we'll map 512MB blocks if possible, otherwise 64K.
And with a 4KB granule, the original code will map 2MB blocks if
possible, and 4KB otherwise.

The patch will make the 4KB granule case also map 1GB blocks if
possible.
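
The block sizes above follow from the granule: descriptors are 8 bytes,
so one table page holds page_size / 8 entries, and each level up
multiplies the mapped size by that count. A minimal sketch of the
arithmetic (entries_shift() and entry_size() are hypothetical names for
illustration, not kernel functions):

```c
#include <stdint.h>

/* log2 of the number of 8-byte descriptors that fit in one table page. */
static unsigned int entries_shift(unsigned int page_shift)
{
	return page_shift - 3;
}

/*
 * Bytes mapped by a single entry at the given level (3 = page level).
 * This is arithmetic only; whether a block descriptor is actually
 * permitted at a level is a separate architectural rule.
 */
static uint64_t entry_size(unsigned int page_shift, int level)
{
	return 1ULL << (page_shift + (3 - level) * entries_shift(page_shift));
}
```

With page_shift = 12 this gives 4KB, 2MB and 1GB for levels 3, 2 and 1;
with page_shift = 16 it gives 64KB and 512MB for levels 3 and 2.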
> Does the code always use the largest possible page size, or does
> it just use either small pages or 1G pages?
The code will put down the largest mappings it can. As physical memory
sizes/addresses are very likely to be aligned to whatever block size we
use, we are likely to achieve the maximum size for our mappings.
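
The "largest mapping that fits" behaviour for a 4KB granule can be
sketched as below. This is an illustration only, not the code from the
patch: next_map_size() and count_mappings() are hypothetical helpers,
and va, pa and end are assumed to be 4KB aligned.

```c
#include <stdint.h>

#define SZ_4K	(1ULL << 12)
#define SZ_2M	(1ULL << 21)
#define SZ_1G	(1ULL << 30)

/*
 * Size of the next mapping to put down at va: a block is usable when
 * va and pa are both aligned to it and at least one whole block remains
 * before end. Try 1GB, then 2MB, then fall back to a 4KB page.
 */
static uint64_t next_map_size(uint64_t va, uint64_t pa, uint64_t end)
{
	static const uint64_t sizes[] = { SZ_1G, SZ_2M };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		uint64_t size = sizes[i];

		if (((va | pa) & (size - 1)) == 0 && end - va >= size)
			return size;
	}
	return SZ_4K;
}

/* Walk [va, end) and count how many mappings of any size are created. */
static unsigned long count_mappings(uint64_t va, uint64_t pa, uint64_t end)
{
	unsigned long n = 0;

	while (va < end) {
		uint64_t size = next_map_size(va, pa, end);

		va += size;
		pa += size;
		n++;
	}
	return n;
}
```

For example, a region of 1GB + 2MB + 4KB whose start is 1GB aligned in
both VA and PA is covered by three mappings: one of each size.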
> In combination with the contiguous page hint, we should be able
> to theoretically support 4KB/64KB/2M/32M/1G/16G TLBs in any
> combination for boot-time mappings on a 4K page size kernel,
> or 64KB/1M/512M/8G on a 64KB page size kernel.
A contiguous hint could be applied to these mappings. The logic would
be a bit more complicated though when we consider different granules.
For 4KB we chain together 16 entries, for 64KB we use 32. If/when we
adopt a 16KB granule, we use 32 entries for a level 2 lookup and
128 entries for a level 3 lookup...
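
The sizes a contiguous hint would cover follow from the entry counts
just given multiplied by the size of one entry at that level. A small
sketch of that arithmetic (cont_entries() and cont_size() are
hypothetical names; the counts are the ones stated above, and
combinations not mentioned return 0):

```c
#include <stdint.h>

/* Bytes mapped by one entry at a level (3 = page level, 8-byte entries). */
static uint64_t entry_size(unsigned int page_shift, int level)
{
	return 1ULL << (page_shift + (3 - level) * (page_shift - 3));
}

/* Number of adjacent entries chained together by the contiguous hint. */
static unsigned int cont_entries(unsigned int page_shift, int level)
{
	switch (page_shift) {
	case 12:	/* 4KB granule */
		return 16;
	case 14:	/* 16KB granule */
		return level == 3 ? 128 : level == 2 ? 32 : 0;
	case 16:	/* 64KB granule */
		return 32;
	}
	return 0;
}

/* Bytes covered by one contiguous run at the given level. */
static uint64_t cont_size(unsigned int page_shift, int level)
{
	return entry_size(page_shift, level) * cont_entries(page_shift, level);
}
```

With these counts a 4KB granule yields 64KB, 32MB and 16GB runs at
levels 3, 2 and 1. A 64KB granule with 32 entries yields 2MB and 16GB
runs at levels 3 and 2; the 1M and 8G figures correspond to chaining 16
entries instead.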

The largest TLB entry sizes that I am aware of in play are the block
sizes (i.e. 2MB, 512MB, 1GB). So I don't think we'll get any benefit at
the moment from adding the contiguous logic.

Cheers,
-- 
Steve
>	Arnd