Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE

[PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Chao Peng <hidden> · 2022-12-02
[PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Fuad Tabba <hidden> · 2022-12-06
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2022-12-07
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Huang, Kai <hidden> · 2022-12-13
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2022-12-19
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Huang, Kai <hidden> · 2022-12-19
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2022-12-20
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Huang, Kai <hidden> · 2022-12-20
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2022-12-21
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Huang, Kai <hidden> · 2022-12-22
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2022-12-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Vlastimil Babka <hidden> · 2023-01-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Kirill A. Shutemov <hidden> · 2023-01-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Vlastimil Babka <hidden> · 2023-02-13
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Huang, Kai <hidden> · 2023-01-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Sean Christopherson <seanjc@google.com> · 2023-01-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Vlastimil Babka <hidden> · 2023-01-24
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Sean Christopherson <seanjc@google.com> · 2022-12-22
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Huang, Kai <hidden> · 2022-12-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2022-12-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Kirill A. Shutemov <hidden> · 2023-01-23
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Vlastimil Babka <hidden> · 2023-02-13
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Michael Roth <hidden> · 2023-02-13
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Sean Christopherson <seanjc@google.com> · 2023-01-13
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2023-01-17
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Sean Christopherson <seanjc@google.com> · 2023-01-17
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2023-01-18
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Isaku Yamahata <hidden> · 2023-01-18
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Alexey Kardashevskiy <hidden> · 2023-02-22
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Chao Peng <hidden> · 2023-02-24
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Nikunj A. Dadhania <hidden> · 2023-02-16
Re: [PATCH v10 1/9] mm: Introduce memfd_restricted system call to create restricted user memory · Michael Roth <hidden> · 2023-03-20
[PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Fabiano Rosas <hidden> · 2022-12-06
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2022-12-07
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Fuad Tabba <hidden> · 2022-12-06
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2022-12-07
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Borislav Petkov <bp@alien8.de> · 2022-12-16
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2022-12-19
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Borislav Petkov <bp@alien8.de> · 2022-12-19
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2022-12-20
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chenyi Qiang <hidden> · 2022-12-28
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2023-01-03
RE: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Wang, Wei W <hidden> · 2023-01-03
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Sean Christopherson <seanjc@google.com> · 2023-01-03
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2023-01-05
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Sean Christopherson <seanjc@google.com> · 2023-01-13
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Binbin Wu <hidden> · 2023-01-17
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Chao Peng <hidden> · 2023-01-17
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Sean Christopherson <seanjc@google.com> · 2023-01-17
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Isaku Yamahata <hidden> · 2023-02-09
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Sean Christopherson <seanjc@google.com> · 2023-02-10
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Isaku Yamahata <hidden> · 2023-02-13
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Sean Christopherson <seanjc@google.com> · 2023-02-14
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Nicolas Saenz Julienne <hidden> · 2023-05-19
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Sean Christopherson <seanjc@google.com> · 2023-05-19
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Nicolas Saenz Julienne <hidden> · 2023-05-19
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Sean Christopherson <seanjc@google.com> · 2023-05-19
Re: [PATCH v10 2/9] KVM: Introduce per-page memory attributes · Nicolas Saenz Julienne <hidden> · 2023-05-23
[PATCH v10 4/9] KVM: Add KVM_EXIT_MEMORY_FAULT exit · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 4/9] KVM: Add KVM_EXIT_MEMORY_FAULT exit · Fuad Tabba <hidden> · 2022-12-06
Re: [PATCH v10 4/9] KVM: Add KVM_EXIT_MEMORY_FAULT exit · Chao Peng <hidden> · 2022-12-07
Re: [PATCH v10 4/9] KVM: Add KVM_EXIT_MEMORY_FAULT exit · Sean Christopherson <seanjc@google.com> · 2023-01-13
[PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Fuad Tabba <hidden> · 2022-12-05
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-12-06
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Fuad Tabba <hidden> · 2022-12-06
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-12-07
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Xiaoyao Li <hidden> · 2022-12-08
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-12-08
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Xiaoyao Li <hidden> · 2022-12-13
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-12-19
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Borislav Petkov <bp@alien8.de> · 2022-12-19
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-12-20
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Borislav Petkov <bp@alien8.de> · 2022-12-20
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2022-12-21
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2023-01-05
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2023-01-06
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Sean Christopherson <seanjc@google.com> · 2023-01-09
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2023-01-10
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Vishal Annapurve <hidden> · 2023-01-10
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Sean Christopherson <seanjc@google.com> · 2023-01-13
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Chao Peng <hidden> · 2023-01-17
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2023-01-20
Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory · Jarkko Sakkinen <jarkko@kernel.org> · 2023-01-20
[PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry · Fuad Tabba <hidden> · 2022-12-05
Re: [PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry · Chao Peng <hidden> · 2022-12-06
Re: [PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry · Fuad Tabba <hidden> · 2022-12-06
Re: [PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry · Chao Peng <hidden> · 2022-12-09
Re: [PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry · Isaku Yamahata <hidden> · 2022-12-07
Re: [PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry · Chao Peng <hidden> · 2022-12-07
[PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Yuan Yao <hidden> · 2022-12-07
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Chao Peng <hidden> · 2022-12-08
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Yuan Yao <hidden> · 2022-12-09
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Fuad Tabba <hidden> · 2022-12-07
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Chao Peng <hidden> · 2022-12-08
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Fuad Tabba <hidden> · 2022-12-09
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Chao Peng <hidden> · 2022-12-12
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Huang, Kai <hidden> · 2022-12-13
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Chao Peng <hidden> · 2022-12-19
Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes · Sean Christopherson <seanjc@google.com> · 2023-01-13
[PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Isaku Yamahata <hidden> · 2022-12-05
Re: [PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Chao Peng <hidden> · 2022-12-06
Re: [PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Isaku Yamahata <hidden> · 2022-12-07
Re: [PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Chao Peng <hidden> · 2022-12-08
Re: [PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Sean Christopherson <seanjc@google.com> · 2023-01-13
Re: [PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Sean Christopherson <seanjc@google.com> · 2023-01-13
Re: [PATCH v10 7/9] KVM: Update lpage info when private/shared memory are mixed · Chao Peng <hidden> · 2023-01-28
[PATCH v10 8/9] KVM: Handle page fault for private memory · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 8/9] KVM: Handle page fault for private memory · Yuan Yao <hidden> · 2022-12-08
Re: [PATCH v10 8/9] KVM: Handle page fault for private memory · Chao Peng <hidden> · 2022-12-08
Re: [PATCH v10 8/9] KVM: Handle page fault for private memory · Yuan Yao <hidden> · 2022-12-09
Re: [PATCH v10 8/9] KVM: Handle page fault for private memory · Fuad Tabba <hidden> · 2022-12-09
Re: [PATCH v10 8/9] KVM: Handle page fault for private memory · Chao Peng <hidden> · 2022-12-12
Re: [PATCH v10 8/9] KVM: Handle page fault for private memory · Sean Christopherson <seanjc@google.com> · 2023-01-13
[PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2022-12-02
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Fuad Tabba <hidden> · 2022-12-09
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Vishal Annapurve <hidden> · 2023-01-05
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2023-01-06
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Sean Christopherson <seanjc@google.com> · 2023-01-14
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2023-01-17
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Sean Christopherson <seanjc@google.com> · 2023-01-17
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2023-01-18
Re: [PATCH v10 9/9] KVM: Enable and expose KVM_MEM_PRIVATE · Chao Peng <hidden> · 2023-01-28
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-01-14
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Kirill A. Shutemov <hidden> · 2023-01-16
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Chao Peng <hidden> · 2023-01-17
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Fuad Tabba <hidden> · 2023-01-17
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Isaku Yamahata <hidden> · 2023-01-19
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-01-19
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Isaku Yamahata <hidden> · 2023-01-19
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-01-24
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Isaku Yamahata <hidden> · 2023-02-08
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Michael Roth <hidden> · 2023-02-13
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Chao Peng <hidden> · 2023-02-21
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Michael Roth <hidden> · 2023-03-23
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Chao Peng <hidden> · 2023-03-24
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-04-12
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Chao Peng <hidden> · 2023-04-17
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-04-17
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Liam Merwick <hidden> · 2023-01-24
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-01-25
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Kirill A. Shutemov <hidden> · 2023-01-25
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Liam Merwick <hidden> · 2023-01-25
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-04-13
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Kirill A. Shutemov <hidden> · 2023-04-13
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Mike Rapoport <rppt@kernel.org> · 2023-02-16
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · David Hildenbrand <hidden> · 2023-02-16
Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM · Sean Christopherson <seanjc@google.com> · 2023-02-22

From: Chao Peng <hidden>
Date: 2023-01-18 09:15:44
Also in: kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel

On Tue, Jan 17, 2023 at 07:35:58PM +0000, Sean Christopherson wrote:

On Tue, Jan 17, 2023, Chao Peng wrote:

quoted

On Sat, Jan 14, 2023 at 12:01:01AM +0000, Sean Christopherson wrote:

quoted

On Fri, Dec 02, 2022, Chao Peng wrote:

quoted

@@ -10357,6 +10364,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 		if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
 			static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
+
+		if (kvm_check_request(KVM_REQ_MEMORY_MCE, vcpu)) {
+			vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;

Synthesizing triple fault shutdown is not the right approach.  Even with TDX's
MCE "architecture" (heavy sarcasm), it's possible that host userspace and the
guest have a paravirt interface for handling memory errors without killing the
host.

Agree shutdown is not the correct choice. I see you made below change:

send_sig_mceerr(BUS_MCEERR_AR, (void __user *)hva, PAGE_SHIFT, current)

The MCE may happen in any thread than KVM thread, sending siginal to
'current' thread may not be the expected behavior.

This is already true today, e.g. a #MC in memory that is mapped into the guest can
be triggered by a host access.  Hrm, but in this case we actually have a KVM
instance, and we know that the #MC is relevant to the KVM instance, so I agree
that signaling 'current' is kludgy.

quoted

 Also how userspace can tell is the MCE on the shared page or private page?
 Do we care?

We care.  I was originally thinking we could require userspace to keep track of
things, but that's quite prescriptive and flawed, e.g. could race with conversions.

One option would be to KVM_EXIT_MEMORY_FAULT, and then wire up a generic (not x86
specific) KVM request to exit to userspace, e.g.

		/* KVM_EXIT_MEMORY_FAULT */
		struct {
#define KVM_MEMORY_EXIT_FLAG_PRIVATE	(1ULL << 3)
#define KVM_MEMORY_EXIT_FLAG_HW_ERROR	(1ULL << 4)
			__u64 flags;
			__u64 gpa;
			__u64 size;
		} memory;

But I'm not sure that's the correct approach.  It kinda feels like we're reinventing
the wheel.  It seems like restrictedmem_get_page() _must_ be able to reject attempts
to get a poisoned page, i.e. restrictedmem_get_page() should yield KVM_PFN_ERR_HWPOISON.

Yes, I see there is -EHWPOISON handling for hva_to_pfn() for shared
memory. It makes sense doing similar for private page.

Assuming that's the case, then I believe KVM simply needs to zap SPTEs in response
to an error notification in order to force vCPUs to fault on the poisoned page.

Agree, this is waht we should do anyway.

quoted

+		return -EINVAL;
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_MEM_SLOTS_NUM)
 		return -EINVAL;
 	if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)

@@ -2020,6 +2154,9 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		if ((kvm->nr_memslot_pages + npages) < kvm->nr_memslot_pages)
 			return -EINVAL;
 	} else { /* Modify an existing slot. */
+		/* Private memslots are immutable, they can only be deleted. */

I'm 99% certain I suggested this, but if we're going to make these memslots
immutable, then we should straight up disallow dirty logging, otherwise we'll
end up with a bizarre uAPI.

But in my mind dirty logging will be needed in the very short time, when
live migration gets supported?

Ya, but if/when live migration support is added, private memslots will no longer
be immutable as userspace will want to enable dirty logging only when a VM is
being migrated, i.e. something will need to change.

Given that it looks like we have clear line of sight to SEV+UPM guests, my
preference would be to allow toggling dirty logging from the get-go.  It doesn't
necessarily have to be in the first patch, e.g. KVM could initially reject
KVM_MEM_LOG_DIRTY_PAGES + KVM_MEM_PRIVATE and then add support separately to make
the series easier to review, test, and bisect.

static int check_memory_region_flags(struct kvm *kvm,
				     const struct kvm_userspace_memory_region2 *mem)
{
	u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;

	if (kvm_arch_has_private_mem(kvm) &&
	    ~(mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
		valid_flags |= KVM_MEM_PRIVATE;

Adding this limitation is OK to me. It's not too hard to remove it when
live migration gets added.


	...
}

quoted

+		if (mem->flags & KVM_MEM_PRIVATE)
+			return -EINVAL;
 		if ((mem->userspace_addr != old->userspace_addr) ||
 		    (npages != old->npages) ||
 		    ((mem->flags ^ old->flags) & KVM_MEM_READONLY))

@@ -2048,10 +2185,28 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	new->npages = npages;
 	new->flags = mem->flags;
 	new->userspace_addr = mem->userspace_addr;
+	if (mem->flags & KVM_MEM_PRIVATE) {
+		new->restricted_file = fget(mem->restricted_fd);
+		if (!new->restricted_file ||
+		    !file_is_restrictedmem(new->restricted_file)) {
+			r = -EINVAL;
+			goto out;
+		}
+		new->restricted_offset = mem->restricted_offset;

I see you changed slot->restricted_offset type from loff_t to gfn_t and
used pgoff_t when doing the restrictedmem_bind/unbind(). Using page
index is reasonable KVM internally and sounds simpler than loff_t. But
we also need initialize it to page index here as well as changes in
another two cases. This is needed when restricted_offset != 0.

Oof.  I'm pretty sure I completely missed that loff_t is used for byte offsets,
whereas pgoff_t is a frame index. 

Given that the restrictmem APIs take pgoff_t, I definitely think it makes sense
to the index, but I'm very tempted to store pgoff_t instead of gfn_t, and name
the field "index" to help connect the dots to the rest of kernel, where "pgoff_t index"
is quite common.

And looking at those bits again, we should wrap all of the restrictedmem fields
with CONFIG_KVM_PRIVATE_MEM.  It'll require minor tweaks to __kvm_set_memory_region(),
but I think will yield cleaner code (and internal APIs) overall.

And wrap the three fields in an anonymous struct?  E.g. this is a little more
versbose (restrictedmem instead restricted), but at first glance it doesn't seem
to cause widespared line length issues.

#ifdef CONFIG_KVM_PRIVATE_MEM
	struct {
		struct file *file;
		pgoff_t index;
		struct restrictedmem_notifier notifier;
	} restrictedmem;
#endif

Looks better.

Thanks,
Chao

quoted

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 547b92215002..49e375e78f30 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h

@@ -2364,8 +2364,7 @@ static inline int kvm_restricted_mem_get_pfn(struct kvm_memory_slot *slot,
                                             gfn_t gfn, kvm_pfn_t *pfn,
                                             int *order)
 {
-       pgoff_t index = gfn - slot->base_gfn +
-                       (slot->restricted_offset >> PAGE_SHIFT);
+       pgoff_t index = gfn - slot->base_gfn + slot->restricted_offset;
        struct page *page;
        int ret;

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 01db35ddd5b3..7439bdcb0d04 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c

@@ -935,7 +935,7 @@ static bool restrictedmem_range_is_valid(struct kvm_memory_slot *slot,
                                         pgoff_t start, pgoff_t end,
                                         gfn_t *gfn_start, gfn_t *gfn_end)
 {
-       unsigned long base_pgoff = slot->restricted_offset >> PAGE_SHIFT;
+       unsigned long base_pgoff = slot->restricted_offset;
 
        if (start > base_pgoff)
                *gfn_start = slot->base_gfn + start - base_pgoff;

@@ -2275,7 +2275,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
                        r = -EINVAL;
                        goto out;
                }
-               new->restricted_offset = mem->restricted_offset;
+               new->restricted_offset = mem->restricted_offset >> PAGE_SHIFT;
        }
 
        r = kvm_set_memslot(kvm, old, new, change);

Chao

quoted

+	}
+
+	new->kvm = kvm;

Set this above, just so that the code flows better.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help