Re: [PATCH 0/4] powernv: kvm: numa fault improvement
From: Liu ping fan <hidden>
Date: 2014-01-21 09:07:25
On Tue, Jan 21, 2014 at 11:40 AM, Aneesh Kumar K.V [off-list ref] wrote:
> Liu ping fan [off-list ref] writes:
>> On Mon, Jan 20, 2014 at 11:45 PM, Aneesh Kumar K.V [off-list ref] wrote:
>>> Liu ping fan [off-list ref] writes:
>>>> On Thu, Jan 9, 2014 at 8:08 PM, Alexander Graf [off-list ref] wrote:
>>>>> On 11.12.2013, at 09:47, Liu Ping Fan [off-list ref] wrote:
>>>>>> This series is based on Aneesh's series "[PATCH -V2 0/5] powerpc: mm: Numa faults support for ppc64". For this series, I apply the same idea from the previous thread "[PATCH 0/3] optimize for powerpc _PAGE_NUMA" (for which I am still trying to get a machine to show numbers). But for this series, I think that I have a good justification -- the heavy cost of switching context between guest and host, which is well known.
>>>>> This cover letter isn't really telling me anything. Please put a proper description of what you're trying to achieve, why you're trying to achieve what you're trying and convince your readers that it's a good idea to do it the way you do it.
>>>> Sorry for the unclear message. After introducing _PAGE_NUMA, kvmppc_do_h_enter() cannot fill in the hpte for the guest. Instead, it has to rely on the host's kvmppc_book3s_hv_page_fault() to call do_numa_page() to do the numa fault check. This incurs the overhead of exiting from rmode to vmode. My idea is that in kvmppc_do_h_enter() we do a quick check: if the page is correctly placed, there is no need to exit to vmode (i.e. saving htab, slab switching).
>>> Can you explain more? Are we looking at an hcall from the guest and the hypervisor handling it in real mode? If so, why would the guest issue an hcall on a pte entry that has PAGE_NUMA set? Or is this about the hypervisor handling a missing hpte because the host swapped this page out? In that case, how do we end up in h_enter? IIUC for that case we should get to kvmppc_hpte_hv_fault.
>> After setting _PAGE_NUMA, we should flush out all hptes, both in the host's htab and the guest's. So when the guest tries to access memory, the host finds that there is no hpte ready for the guest in the guest's htab. And the host should raise a dsi to the guest.
> Now the guest receives that fault, removes the PAGE_NUMA bit and does an hpte_insert. So before we do an hpte_insert (or H_ENTER) we should have cleared the PAGE_NUMA bit.
>> This causes the guest to end up in h_enter. And you can see in the current code, we also try this quick path first. Only if that fails do we resort to the slow path -- kvmppc_hpte_hv_fault.
> hmm ? hpte_hv_fault is the hypervisor handling the fault.
After our discussion on IRC, I think we should also do the fast check in kvmppc_hpte_hv_fault() for the HPTE_V_ABSENT case, and let H_ENTER take care of the remaining case, i.e. no hpte when pte_mknuma. Right?

Thanks and regards,
Fan
> -aneesh
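
For illustration only, the quick check discussed in this thread (skip the exit to vmode when the page is already correctly placed) might be modelled roughly as below. This is a user-space sketch, not kernel code and not the code from the series: struct sketch_pte, enum sketch_path, sketch_numa_quick_check() and the target_nid parameter are all hypothetical names, and it assumes "correctly placed" means the backing page already sits on the desired node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a host PTE; this is NOT the kernel's pte_t. */
struct sketch_pte {
	bool present;   /* a page is mapped at all */
	bool numa;      /* models the _PAGE_NUMA marker */
	int  page_nid;  /* node the backing page currently lives on */
};

/* Hypothetical outcome of the quick check. */
enum sketch_path {
	SKETCH_FAST_PATH,  /* stay in real mode and install the hpte */
	SKETCH_SLOW_PATH   /* exit to virtual mode for full fault handling */
};

/*
 * Decide whether the NUMA fault can be resolved without leaving real mode.
 * target_nid is an assumed input: the node the page is expected to be on.
 */
static enum sketch_path
sketch_numa_quick_check(const struct sketch_pte *pte, int target_nid)
{
	assert(pte != NULL);

	if (!pte->present)
		return SKETCH_SLOW_PATH;   /* nothing mapped: needs the host */
	if (!pte->numa)
		return SKETCH_FAST_PATH;   /* no NUMA marker: ordinary insert */
	if (pte->page_nid == target_nid)
		return SKETCH_FAST_PATH;   /* marked, but already well placed */
	return SKETCH_SLOW_PATH;       /* misplaced: full NUMA fault needed */
}
```

In this model the same helper could be called from both places mentioned above (the H_ENTER path and the HPTE_V_ABSENT case in the fault path); a real implementation would also have to clear the NUMA marker on the fast path, which the sketch leaves out.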