Thread (153 messages) 153 messages, 23 authors, 2023-05-23

Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM

From: Michael Roth <hidden>
Date: 2023-03-23 01:28:04
Also in: kvm, linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel

On Tue, Feb 21, 2023 at 08:11:35PM +0800, Chao Peng wrote:
quoted
Hi Sean,

We've rebased the SEV+SNP support onto your updated UPM base support
tree and things seem to be working okay, but we needed some fixups on
top of the base support get things working, along with 1 workaround
for an issue that hasn't been root-caused yet:

  https://github.com/mdroth/linux/commits/upmv10b-host-snp-v8-wip

  *stash (upm_base_support): mm: restrictedmem: Kirill's pinning implementation
  *workaround (use_base_support): mm: restrictedmem: loosen exclusivity check
What I'm seeing is Slot#3 gets added first and then deleted. When it's
gets added, Slot#0 already has the same range bound to restrictedmem so
trigger the exclusive check. This check is exactly the current code for.
With the following change in QEMU, we no longer trigger this check:

  diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
  index 20da121374..849b5de469 100644
  --- a/hw/pci-host/q35.c
  +++ b/hw/pci-host/q35.c
  @@ -588,9 +588,9 @@ static void mch_realize(PCIDevice *d, Error **errp)
       memory_region_init_alias(&mch->open_high_smram, OBJECT(mch), "smram-open-high",
                                mch->ram_memory, MCH_HOST_BRIDGE_SMRAM_C_BASE,
                                MCH_HOST_BRIDGE_SMRAM_C_SIZE);
  +    memory_region_set_enabled(&mch->open_high_smram, false);
       memory_region_add_subregion_overlap(mch->system_memory, 0xfeda0000,
                                           &mch->open_high_smram, 1);
  -    memory_region_set_enabled(&mch->open_high_smram, false);

I'm not sure if QEMU is actually doing something wrong here though or if
this check is putting tighter restrictions on userspace than what was
expected before. Will look into it more.
quoted
  *fixup (upm_base_support): KVM: use inclusive ranges for restrictedmem binding/unbinding
  *fixup (upm_base_support): mm: restrictedmem: use inclusive ranges for issuing invalidations
As many kernel APIs treat 'end' as exclusive, I would rather keep using
exclusive 'end' for these APIs(restrictedmem_bind/restrictedmem_unbind
and notifier callbacks) but fix it internally in the restrictedmem. E.g.
all the places where xarray API needs a 'last'/'max' we use 'end - 1'.
See below for the change.
Yes I did feel like I was fighting the kernel a bit on that; your
suggestion seems like it would be a better fit.
quoted
  *fixup (upm_base_support): KVM: fix restrictedmem GFN range calculations
Subtracting slot->restrictedmem.index for start/end in
restrictedmem_get_gfn_range() is the correct fix.
quoted
  *fixup (upm_base_support): KVM: selftests: CoCo compilation fixes

We plan to post an updated RFC for v8 soon, but also wanted to share
the staging tree in case you end up looking at the UPM integration aspects
before then.

-Mike
This is the restrictedmem fix to solve 'end' being stored and checked in xarray:
Looks good.

Thanks!

-Mike
quoted hunk ↗ jump to hunk
--- a/mm/restrictedmem.c
+++ b/mm/restrictedmem.c
@@ -46,12 +46,12 @@ static long restrictedmem_punch_hole(struct restrictedmem *rm, int mode,
         */
        down_read(&rm->lock);
 
-       xa_for_each_range(&rm->bindings, index, notifier, start, end)
+       xa_for_each_range(&rm->bindings, index, notifier, start, end - 1)
                notifier->ops->invalidate_start(notifier, start, end);
 
        ret = memfd->f_op->fallocate(memfd, mode, offset, len);
 
-       xa_for_each_range(&rm->bindings, index, notifier, start, end)
+       xa_for_each_range(&rm->bindings, index, notifier, start, end - 1)
                notifier->ops->invalidate_end(notifier, start, end);
 
        up_read(&rm->lock);
@@ -224,7 +224,7 @@ static int restricted_error_remove_page(struct address_space *mapping,
                }
                spin_unlock(&inode->i_lock);
 
-               xa_for_each_range(&rm->bindings, index, notifier, start, end)
+               xa_for_each_range(&rm->bindings, index, notifier, start, end - 1)
                        notifier->ops->error(notifier, start, end);
                break;
        }
@@ -301,11 +301,12 @@ int restrictedmem_bind(struct file *file, pgoff_t start, pgoff_t end,
                if (exclusive != rm->exclusive)
                        goto out_unlock;
 
-               if (exclusive && xa_find(&rm->bindings, &start, end, XA_PRESENT))
+               if (exclusive &&
+                   xa_find(&rm->bindings, &start, end - 1, XA_PRESENT))
                        goto out_unlock;
        }
 
-       xa_store_range(&rm->bindings, start, end, notifier, GFP_KERNEL);
+       xa_store_range(&rm->bindings, start, end - 1, notifier, GFP_KERNEL);
        rm->exclusive = exclusive;
        ret = 0;
 out_unlock:
@@ -320,7 +321,7 @@ void restrictedmem_unbind(struct file *file, pgoff_t start, pgoff_t end,
        struct restrictedmem *rm = file->f_mapping->private_data;
 
        down_write(&rm->lock);
-       xa_store_range(&rm->bindings, start, end, NULL, GFP_KERNEL);
+       xa_store_range(&rm->bindings, start, end - 1, NULL, GFP_KERNEL);
        synchronize_rcu();
        up_write(&rm->lock);
 }
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help