Re: [PATCH v11 1/7] mm: ksm: Export ksm_madvise()
From: Bharata B Rao <hidden>
Date: 2019-11-27 06:53:16
Also in:
linux-mm
On Tue, Nov 26, 2019 at 07:59:49PM -0800, Hugh Dickins wrote:
On Mon, 25 Nov 2019, Bharata B Rao wrote:quoted
On PEF-enabled POWER platforms that support running of secure guests, secure pages of the guest are represented by device private pages in the host. Such pages needn't participate in KSM merging. This is achieved by using ksm_madvise() call which need to be exported since KVM PPC can be a kernel module. Signed-off-by: Bharata B Rao <redacted> Acked-by: Paul Mackerras <redacted> Cc: Andrea Arcangeli <redacted> Cc: Hugh Dickins <hughd@google.com>I can say Acked-by: Hugh Dickins <hughd@google.com> to this one. But not to your 2/7 which actually makes use of it: because sadly it needs down_write(&kvm->mm->mmap_sem) for the case when it switches off VM_MERGEABLE in vma->vm_flags. That's frustrating, since I think it's the only operation for which down_read() is not good enough.
Oh ok! Thanks for pointing this out.
I have no idea how contended that mmap_sem is likely to be, nor how many to-be-secured pages that vma is likely to contain: you might find it okay simply to go with it down_write throughout, or you might want to start out with it down_read, and only restart with down_write (then perhaps downgrade_write later) when you see VM_MERGEABLE is set.
Using down_write throughtout is not easy as we do migrate_vma_pages() from fault path (->migrate_to_ram()) too. Here we come with down_read already held. Starting with down_read and restarting with down_write if VM_MERGEABLE is set -- this also looks a bit difficult as we will have challenges with locking order if we release mmap_sem in between and re-acquire. So I think I will start with down_write in this particular case and will downgrade_write as soon as ksm_madvise() is complete.
The crash you got (thanks for the link): that will be because your migrate_vma_pages() had already been applied to a page that was already being shared via KSM. But if these secure pages are expected to be few and far between, maybe you'd prefer to keep VM_MERGEABLE, and add per-page checks of some kind into mm/ksm.c, to skip over these surprising hybrids.
I did bail out from a few routines in mm/ksm.c with is_device_private_page(page) check, but that wasn't good enough and I encountered crashes in different code paths. Guess a bit more understanding of KSM internals would be required before retrying that. However since all the pages of the guest except for a few will be turned into secure pages early during boot, it appears better if secure guests don't participate in in KSM merging at all. Regards, Bharata.