Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy

[PATCH v3 00/14] KVM: s390: pv: implement lazy destroy · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 01/14] KVM: s390: pv: add macros for UVC CC values · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
Re: [PATCH v3 01/14] KVM: s390: pv: add macros for UVC CC values · David Hildenbrand <hidden> · 2021-08-06
Re: [PATCH v3 01/14] KVM: s390: pv: add macros for UVC CC values · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-06
Re: [PATCH v3 01/14] KVM: s390: pv: add macros for UVC CC values · Janosch Frank <frankja@linux.ibm.com> · 2021-08-06
[PATCH v3 02/14] KVM: s390: pv: avoid stall notifications for some UVCs · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
Re: [PATCH v3 02/14] KVM: s390: pv: avoid stall notifications for some UVCs · David Hildenbrand <hidden> · 2021-08-06
Re: [PATCH v3 02/14] KVM: s390: pv: avoid stall notifications for some UVCs · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-06
[PATCH v3 04/14] KVM: s390: pv: properly handle page flags for protected guests · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 08/14] KVM: s390: pv: usage counter instead of flag · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 10/14] KVM: s390: pv: lazy destroy for reboot · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 09/14] KVM: s390: pv: add export before import · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 11/14] KVM: s390: pv: extend lazy destroy to handle shutdown · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 12/14] KVM: s390: pv: module parameter to fence lazy destroy · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 13/14] KVM: s390: pv: add OOM notifier for lazy destroy · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 14/14] KVM: s390: pv: avoid export before import if possible · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 07/14] KVM: s390: pv: refactor s390_reset_acc · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 06/14] KVM: s390: pv: handle secure storage exceptions for normal guests · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 05/14] KVM: s390: pv: handle secure storage violations for protected guests · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
[PATCH v3 03/14] KVM: s390: pv: leak the ASCE page when destroy fails · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-04
Re: [PATCH v3 03/14] KVM: s390: pv: leak the ASCE page when destroy fails · David Hildenbrand <hidden> · 2021-08-06
Re: [PATCH v3 03/14] KVM: s390: pv: leak the ASCE page when destroy fails · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-06
Re: [PATCH v3 03/14] KVM: s390: pv: leak the ASCE page when destroy fails · David Hildenbrand <hidden> · 2021-08-06
Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy · David Hildenbrand <hidden> · 2021-08-06
Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-06
Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy · David Hildenbrand <hidden> · 2021-08-06
Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy · Claudio Imbrenda <imbrenda@linux.ibm.com> · 2021-08-06
Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy · David Hildenbrand <hidden> · 2021-08-09

From: David Hildenbrand <hidden>
Date: 2021-08-06 07:10:35
Also in: linux-mm, linux-s390, lkml

On 04.08.21 17:40, Claudio Imbrenda wrote:

Previously, when a protected VM was rebooted or when it was shut down,
its memory was made unprotected, and then the protected VM itself was
destroyed. Looping over the whole address space can take some time,
considering the overhead of the various Ultravisor Calls (UVCs). This
means that a reboot or a shutdown would take a potentially long amount
of time, depending on the amount of used memory.

This patch series implements a deferred destroy mechanism for protected
guests. When a protected guest is destroyed, its memory is cleared in
the background, allowing the guest to restart or terminate significantly
faster than before.

There are 2 possibilities when a protected VM is torn down:
* it still has an address space associated (reboot case)
* it does not have an address space anymore (shutdown case)

For the reboot case, the reference count of the mm is increased, and
then a background thread is started to clean up. Once the thread has
gone through the whole address space, the protected VM is actually
destroyed.

That doesn't sound too hacky to me, and actually sounds like a good 
idea, doing what the guest would do either way but speeding it up 
asynchronously, but ...

For the shutdown case, a list of pages to be destroyed is formed when
the mm is torn down. Instead of just unmapping the pages when the
address space is being torn down, they are also set aside. Later when
KVM cleans up the VM, a thread is started to clean up the pages from
the list.

... this ...

This means that the same address space can have memory belonging to
more than one protected guest, although only one will be running, the
others will in fact not even have any CPUs.

... this ...

When a guest is destroyed, its memory still counts towards its memory
control group until it's actually freed (I tested this experimentally).

When the system runs out of memory, if a guest has terminated and its
memory is being cleaned asynchronously, the OOM killer will wait a
little and then see if memory has been freed. This has the practical
effect of slowing down memory allocations when the system is out of
memory to give the cleanup thread time to clean up and free memory, and
avoid an actual OOM situation.

... and this sounds like the kind of arch MM hack that will bite us in
the long run. Of course, I might be wrong, but already doing excessive
GFP_ATOMIC allocations or messing with the OOM killer that way for a
pure (shutdown) optimization is an alarm signal.

You should at least CC linux-mm. I'll do that right now and also CC
Michal. He might have time to take a quick look at patches #11 and #13.

https://lkml.kernel.org/r/20210804154046.88552-12-imbrenda@linux.ibm.com
https://lkml.kernel.org/r/20210804154046.88552-14-imbrenda@linux.ibm.com

IMHO, we should proceed with patches 1-10, as they solve a really
important problem ("slow reboots") in a nice way, whereas patch 11
handles a case that can be worked around comparatively easily by
management tools -- my 2 cents.
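
To make the reboot-case flow from the cover letter concrete (take an
extra reference on the mm, return to the caller, let a background thread
walk the address space, and only then destroy the protected VM), here is
a rough userspace sketch in Python. It is not the kernel code from the
series: AddressSpace, ProtectedVM and lazy_destroy are hypothetical
names invented for illustration, and discarding a page from a set merely
stands in for the per-page Ultravisor Call.

```python
import threading


class AddressSpace:
    """Hypothetical stand-in for an mm: a refcount plus protected pages."""

    def __init__(self, pages):
        self.refcount = 1              # reference held by the owning process
        self.protected = set(pages)    # pages still belonging to the guest
        self.lock = threading.Lock()

    def get(self):
        with self.lock:
            self.refcount += 1

    def put(self):
        with self.lock:
            self.refcount -= 1


class ProtectedVM:
    """Hypothetical stand-in for the protected VM being torn down."""

    def __init__(self, mm):
        self.mm = mm
        self.destroyed = False


def lazy_destroy(vm):
    """Start the cleanup in the background and return without waiting."""
    mm = vm.mm
    mm.get()  # keep the address space alive while the worker runs

    def worker():
        # Walk a snapshot of the pages; each discard stands in for one UVC.
        for page in sorted(mm.protected):
            mm.protected.discard(page)
        # The VM is destroyed only after the whole address space was walked.
        vm.destroyed = True
        mm.put()  # drop the extra reference taken above

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


# Usage: the caller gets control back right away and could restart the
# guest; joining here is only so the final state can be inspected.
mm = AddressSpace(range(1000))
vm = ProtectedVM(mm)
cleanup = lazy_destroy(vm)
cleanup.join()
```

The sketch covers only the reboot case; the shutdown case, where the
pages are set aside on a list because the mm is already gone, is not
modelled here.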

-- 
Thanks,

David / dhildenb
