Re: [PATCH 0/2] Faster MMU lookups for Book3s v3

From: Alexander Graf <hidden>
Date: 2010-07-01 12:28:11
Also in: kvm

Avi Kivity wrote:

On 07/01/2010 01:00 PM, Alexander Graf wrote:

quoted

But doesn't that mean that you still need to loop through all the hvas
that you want to invalidate?

It does.

quoted

  Wouldn't it speed up dirty bitmap flushing
a lot if we'd just have a simple linked list of all sPTEs belonging to
that memslot?

The complexity is O(pages_in_slot) + O(sptes_for_slot).

Usually, every page is mapped at least once, so sptes_for_slot
dominates.  Even when it isn't so, iterating the rmap base pointers is
very fast since they are linear in memory, while sptes are scattered
around, causing cache misses.
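
To make the O(pages_in_slot) + O(sptes_for_slot) cost concrete, here is a minimal sketch of such a walk. It is a simplified model, not the actual KVM code: `struct spte_node`, `struct memslot`, `SPTE_WRITABLE` and `slot_write_protect()` are hypothetical names invented for illustration. The outer loop touches the rmap base pointers, which sit contiguously in memory; the inner loop follows each chain to sptes that may live anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; these are NOT the real KVM structures. */
#define SPTE_WRITABLE (1ULL << 1)   /* assumed position of the writable bit */

struct spte_node {
    uint64_t spte;                  /* shadow PTE value */
    struct spte_node *next;         /* next spte mapping the same guest page */
};

struct memslot {
    size_t npages;                  /* pages_in_slot */
    struct spte_node **rmap;        /* one chain head per page, linear in memory */
};

/*
 * Write-protect every spte that maps a page of the slot.
 * Returns the number of sptes visited (sptes_for_slot).
 */
static size_t slot_write_protect(struct memslot *slot)
{
    size_t visited = 0;

    assert(slot != NULL);

    /* O(pages_in_slot): sequential scan over the rmap base pointers. */
    for (size_t i = 0; i < slot->npages; i++) {
        /* O(sptes_for_slot) in total: each hop may be a cache miss. */
        for (struct spte_node *n = slot->rmap[i]; n != NULL; n = n->next) {
            n->spte &= ~SPTE_WRITABLE;
            visited++;
        }
    }
    return visited;
}
```

A page with no mapping costs only one pointer load in the outer loop, which is why the unmapped case stays cheap in this model.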

Why would pages be mapped often? Don't you use lazy spte updates?

Another consideration is that on x86, an spte occupies just 64 bits
(for the hardware pte); if there are multiple sptes per page (rare on
modern hardware), there is also extra memory for rmap chains;
sometimes we also allocate 64 bits for the gfn.  Having an extra
linked list would require more memory to be allocated and maintained.

Hrm. I was thinking of not having an rmap but only using the chain. The
only slots that would require such a chain would be the ones with dirty
bitmapping enabled, so no penalty for normal RAM (unless you use kemari
or live migration of course).
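
Roughly what I have in mind, as a sketch only. All names here (`struct chained_spte`, `struct logged_slot`, `slot_track_spte()`, `slot_flush_chain()`, `SPTE_WRITABLE`) are made up for illustration and do not exist in the tree. Only slots with dirty logging enabled link their sptes into the chain, so a flush costs O(sptes_for_slot) without scanning every page of the slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch; none of these names come from the real code. */
#define SPTE_WRITABLE (1ULL << 1)   /* assumed position of the writable bit */

struct chained_spte {
    uint64_t spte;                      /* shadow PTE value */
    struct chained_spte *next_in_slot;  /* per-slot chain link */
};

struct logged_slot {
    bool dirty_logging;                 /* chain is only kept when this is set */
    struct chained_spte *chain;         /* head of the per-slot spte list */
};

/* Called when an spte is created for a page in the slot. */
static void slot_track_spte(struct logged_slot *slot, struct chained_spte *s)
{
    assert(slot != NULL && s != NULL);

    s->next_in_slot = NULL;
    if (!slot->dirty_logging)
        return;                         /* normal RAM: no chain, no overhead */
    s->next_in_slot = slot->chain;      /* push onto the front of the chain */
    slot->chain = s;
}

/* Dirty bitmap flush: write-protect every chained spte, return the count. */
static size_t slot_flush_chain(struct logged_slot *slot)
{
    size_t visited = 0;

    for (struct chained_spte *s = slot->chain; s != NULL; s = s->next_in_slot) {
        s->spte &= ~SPTE_WRITABLE;
        visited++;
    }
    return visited;
}
```

The sketch leaves out removal: unlinking a single spte from a singly linked chain means walking the chain, so dropping individual mappings would be the slow path in this design.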

But then again I probably do need an rmap for the mmu_notifier magic,
right? But I'd rather have that code path be slow and the
dirty bitmap invalidation fast than the other way around. Swapping is
slow either way.


Alex
