Thread (10 messages) 10 messages, 3 authors, 2020-06-29

Re: [PATCH v3 0/4] Migrate non-migrated pages of a SVM.

From: Ram Pai <hidden>
Date: 2020-06-29 19:31:19

On Mon, Jun 29, 2020 at 07:23:30AM +0530, Bharata B Rao wrote:
On Sun, Jun 28, 2020 at 09:41:53PM +0530, Bharata B Rao wrote:
quoted
On Fri, Jun 19, 2020 at 03:43:38PM -0700, Ram Pai wrote:
quoted
The time taken to switch a VM to Secure-VM, increases by the size of the VM.  A
100GB VM takes about 7minutes. This is unacceptable.  This linear increase is
caused by a suboptimal behavior by the Ultravisor and the Hypervisor.  The
Ultravisor unnecessarily migrates all the GFN of the VM from normal-memory to
secure-memory. It has to just migrate the necessary and sufficient GFNs.

However when the optimization is incorporated in the Ultravisor, the Hypervisor
starts misbehaving. The Hypervisor has a inbuilt assumption that the Ultravisor
will explicitly request to migrate, each and every GFN of the VM. If only
necessary and sufficient GFNs are requested for migration, the Hypervisor
continues to manage the remaining GFNs as normal GFNs. This leads of memory
corruption, manifested consistently when the SVM reboots.

The same is true, when a memory slot is hotplugged into a SVM. The Hypervisor
expects the ultravisor to request migration of all GFNs to secure-GFN.  But at
the same time, the hypervisor is unable to handle any H_SVM_PAGE_IN requests
from the Ultravisor, done in the context of UV_REGISTER_MEM_SLOT ucall.  This
problem manifests as random errors in the SVM, when a memory-slot is
hotplugged.

This patch series automatically migrates the non-migrated pages of a SVM,
     and thus solves the problem.
So this is what I understand as the objective of this patchset:

1. Getting all the pages into the secure memory right when the guest
   transitions into secure mode is expensive. Ultravisor wants to just get
   the necessary and sufficient pages in and put the onus on the Hypervisor
   to mark the remaining pages (w/o actual page-in) as secure during
   H_SVM_INIT_DONE.
2. During H_SVM_INIT_DONE, you want a way to differentiate the pages that
   are already secure from the pages that are shared and that are paged-out.
   For this you are introducing all these new states in HV.

UV knows about the shared GFNs and maintains the state of the same. Hence
let HV send all the pages (minus already secured pages) via H_SVM_PAGE_IN
and if UV finds any shared pages in them, let it fail the uv-page-in call.
Then HV can fail the migration for it  and the page continues to remain
shared. With this, you don't need to maintain a state for secured GFN in HV.

In the unlikely case of sending a paged-out page to UV during
H_SVM_INIT_DONE, let the page-in succeed and HV will fault on it again
if required. With this, you don't need a state in HV to identify a
paged-out-but-encrypted state.

Doesn't the above work?
I see that you want to infact skip the uv-page-in calls from H_SVM_INIT_DONE.
So that would need the extra states in HV which you are proposing here.
Yes. I want to skip to speed up the overall ESM switch.

RP
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help