Thread (305 messages), 22 authors, 2023-01-05

Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled

From: "jarkko@kernel.org" <jarkko@kernel.org>
Date: 2022-08-02 12:17:41
Also in: kvm, linux-crypto, linux-mm, lkml

On Tue, Jun 21, 2022 at 08:17:15PM +0000, Kalra, Ashish wrote:
[Public]

Hello Peter,
quoted
quoted
+static int snp_reclaim_pages(unsigned long pfn, unsigned int npages, bool locked)
+{
+       struct sev_data_snp_page_reclaim data;
+       int ret, err, i, n = 0;
+
+       for (i = 0; i < npages; i++) {
quoted
What about setting |n| here too, along with the other increments?
quoted
for (i = 0, n = 0; i < npages; i++, n++, pfn++)
Yes that is simpler.
quoted
quoted
+               memset(&data, 0, sizeof(data));
+               data.paddr = pfn << PAGE_SHIFT;
+
+               if (locked)
+                       ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
+               else
+                       ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
quoted
Can we change `sev_cmd_mutex` to some sort of nesting lock type? That could clean up this if (locked) code.
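To make the nesting-lock idea concrete, here is a rough userspace sketch. It is only an illustration under loud assumptions: kernel mutexes are not recursive, so this uses a POSIX recursive mutex as a stand-in for sev_cmd_mutex, and cmd_lock, do_cmd() and reclaim_with_lock_held() are hypothetical names, not functions from the patch. The point is just that a single entry point can take the lock unconditionally once re-acquisition by the owner is allowed.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* exposes PTHREAD_MUTEX_RECURSIVE in strict -std modes */
#endif
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for sev_cmd_mutex (the kernel mutex is NOT recursive). */
static pthread_mutex_t cmd_lock;
static int depth, max_depth, cmds_issued;

/* Initialise cmd_lock as a recursive mutex so its owner may re-acquire it. */
static int cmd_lock_init(void)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	if (!ret)
		ret = pthread_mutex_init(&cmd_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

static void cmd_lock_acquire(void)
{
	int ret = pthread_mutex_lock(&cmd_lock);

	assert(ret == 0);
	(void)ret;
	if (++depth > max_depth)
		max_depth = depth; /* remember the deepest nesting seen */
}

static void cmd_lock_release(void)
{
	int ret;

	depth--;
	ret = pthread_mutex_unlock(&cmd_lock);
	assert(ret == 0);
	(void)ret;
}

/* Single entry point: callers no longer pass a "locked" flag. */
static int do_cmd(int cmd)
{
	cmd_lock_acquire();
	(void)cmd; /* a real implementation would issue the firmware command here */
	cmds_issued++;
	cmd_lock_release();
	return 0;
}

/* A caller that already holds the lock and still goes through do_cmd(). */
static int reclaim_with_lock_held(int cmd)
{
	int ret;

	cmd_lock_acquire();
	ret = do_cmd(cmd); /* nested acquisition, no deadlock */
	cmd_lock_release();
	return ret;
}
```

Whether anything like this is acceptable in the kernel is a separate question; the sketch only shows how the if (locked) branch would disappear.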
quoted
+static inline int rmp_make_firmware(unsigned long pfn, int level)
+{
+       return rmp_make_private(pfn, 0, level, 0, true);
+}
+
+static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, bool to_fw, bool locked,
+                            bool need_reclaim)
quoted
This function can do a lot, and when I read the call sites it's hard to see what it's doing, since a combination of arguments tells us which behavior is happening, and some combinations are not valid (e.g. to_fw == true and need_reclaim == true is an invalid argument combination).
to_fw is used to make a firmware page and need_reclaim is for freeing the firmware page, so they are going to be mutually exclusive. 

I can actually follow it quite logically from the callers:
snp_alloc_firmware_pages will call with to_fw = true and need_reclaim = false
and snp_free_firmware_pages will do the opposite, to_fw = false and need_reclaim = true.

That seems straightforward to look at.
quoted
Also this for loop over |npages| is duplicated from snp_reclaim_pages(). One possible improvement: in the current snp_reclaim_pages(), if we fail to reclaim a page we assume we cannot reclaim the following pages either, which may cause us to snp_leak_pages() more pages than we actually need to.
Yes that is true.
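The point about not giving up after the first failed page can be sketched in plain userspace C. This is a minimal illustration, not the patch's code: reclaim_fn, struct reclaim_result, reclaim_each_page() and mock_reclaim() are all hypothetical names, and the mock simply pretends that one particular PFN cannot be reclaimed.

```c
#include <assert.h>

#define MAX_PAGES 64

/* Hypothetical stand-in for issuing SNP_PAGE_RECLAIM on one page. */
typedef int (*reclaim_fn)(unsigned long pfn);

struct reclaim_result {
	unsigned int reclaimed;               /* pages successfully reclaimed */
	unsigned int leaked;                  /* pages that could not be reclaimed */
	unsigned long leaked_pfns[MAX_PAGES]; /* which pages those were */
};

/*
 * Try every page in [pfn, pfn + npages). A failure on one page does not
 * stop the loop, so only the pages that really failed end up leaked.
 * Returns 0 if all pages were reclaimed, otherwise the first error seen.
 */
static int reclaim_each_page(unsigned long pfn, unsigned int npages,
			     reclaim_fn reclaim, struct reclaim_result *res)
{
	unsigned int i;
	int first_err = 0;

	assert(reclaim && res && npages <= MAX_PAGES);
	res->reclaimed = 0;
	res->leaked = 0;

	for (i = 0; i < npages; i++, pfn++) {
		int ret = reclaim(pfn);

		if (ret) {
			if (!first_err)
				first_err = ret;
			res->leaked_pfns[res->leaked++] = pfn;
			continue; /* keep going with the remaining pages */
		}
		res->reclaimed++;
	}
	return first_err;
}

/* Mock: pretend the firmware refuses to reclaim PFN 0x102 only. */
static int mock_reclaim(unsigned long pfn)
{
	return pfn == 0x102 ? -5 : 0;
}
```

With four pages starting at PFN 0x100 and the mock above, three pages are reclaimed and only 0x102 is recorded as leaked, instead of 0x102 and everything after it.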
quoted
What about something like this?
quoted
static int snp_leak_page(u64 pfn, enum pg_level level) {
  memory_failure(pfn, 0);
  dump_rmpentry(pfn);
  return 0;
}
quoted
static int snp_reclaim_page(u64 pfn, enum pg_level level) {
 struct sev_data_snp_page_reclaim data;
 int ret, err;

 memset(&data, 0, sizeof(data));
 data.paddr = pfn << PAGE_SHIFT;
quoted
 ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
 if (ret)
   goto cleanup;
quoted
 ret = rmp_make_shared(pfn, level);
 if (ret)
   goto cleanup;
quoted
return 0;
quoted
cleanup:
   snp_leak_page(pfn, level);
   return ret;
}
quoted
typedef int (*rmp_state_change_func) (u64 pfn, enum pg_level level);
quoted
static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, rmp_state_change_func state_change, rmp_state_change_func cleanup) {
 unsigned long pfn = paddr >> PAGE_SHIFT;
 int ret, i, n = 0;
quoted
 for (i = 0, n = 0; i < npages; i++, n++, pfn++) {
   ret = state_change(pfn, PG_LEVEL_4K);
   if (ret)
     goto cleanup;
 }
quoted
 return 0;
quoted
cleanup:
 for (; i>= 0; i--, n--, pfn--) {
   cleanup(pfn, PG_LEVEL_4K);
 }
quoted
 return ret;
}
quoted
Then inside of __snp_alloc_firmware_pages():
quoted
snp_set_rmp_state(paddr, npages, rmp_make_firmware, snp_reclaim_page);
quoted
And inside of __snp_free_firmware_pages():
quoted
snp_set_rmp_state(paddr, npages, snp_reclaim_page, snp_leak_page);
quoted
Just a suggestion; feel free to ignore. The readability comment could be addressed much less invasively by just making separate functions for each valid combination of arguments here, like snp_set_rmp_fw_state(), snp_set_rmp_shared_state(), snp_set_rmp_release_state() or something.
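For reference, the callback idea and the "one named wrapper per valid transition" idea can be combined in a small self-contained mock. Everything here is an assumption made for illustration: the state[] array stands in for the RMP, make_firmware()/make_shared() stand in for the real transitions, and set_fw_state()/set_shared_state() are hypothetical wrapper names. Unlike the suggestion quoted above, this sketch rolls back only the pages that were actually changed before the failure.

```c
#include <assert.h>
#include <stddef.h>

enum page_state { PAGE_SHARED, PAGE_FIRMWARE };

#define NPAGES 8
static enum page_state state[NPAGES]; /* mock RMP: one entry per PFN */
static long fail_pfn = -1;            /* mock: PFN whose transition fails */

typedef int (*state_change_fn)(unsigned long pfn);

static int make_firmware(unsigned long pfn)
{
	if ((long)pfn == fail_pfn)
		return -1;
	state[pfn] = PAGE_FIRMWARE;
	return 0;
}

static int make_shared(unsigned long pfn)
{
	state[pfn] = PAGE_SHARED;
	return 0;
}

/*
 * Apply change() to each page in turn. On failure, call undo() (if any)
 * on the pages that were already changed, newest first.
 */
static int set_state(unsigned long pfn, unsigned int npages,
		     state_change_fn change, state_change_fn undo)
{
	unsigned int i;
	int ret = 0;

	assert(change && pfn + npages <= NPAGES);
	for (i = 0; i < npages; i++) {
		ret = change(pfn + i);
		if (ret)
			break;
	}
	if (!ret)
		return 0;

	while (undo && i-- > 0)
		undo(pfn + i);
	return ret;
}

/* Named wrappers so that call sites read plainly, with no boolean flags. */
static int set_fw_state(unsigned long pfn, unsigned int npages)
{
	return set_state(pfn, npages, make_firmware, make_shared);
}

static int set_shared_state(unsigned long pfn, unsigned int npages)
{
	return set_state(pfn, npages, make_shared, NULL);
}
```

A call site then reads set_fw_state(pfn, npages) rather than a row of true/false arguments, and an invalid combination simply has no wrapper.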
quoted
quoted
+static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked)
+{
+       unsigned long npages = 1ul << order, paddr;
+       struct sev_device *sev;
+       struct page *page;
+
+       if (!psp_master || !psp_master->sev_data)
+               return NULL;
+
+       page = alloc_pages(gfp_mask, order);
+       if (!page)
+               return NULL;
+
+       /* If SEV-SNP is initialized then add the page in RMP table. */
+       sev = psp_master->sev_data;
+       if (!sev->snp_inited)
+               return page;
+
+       paddr = __pa((unsigned long)page_address(page));
+       if (snp_set_rmp_state(paddr, npages, true, locked, false))
+               return NULL;
quoted
So what about the case where snp_set_rmp_state() fails but we were able to reclaim all the pages? Should we be able to signal that to callers so that we could free |page| here? But given this is an error path already, maybe we can optimize this in a follow-up series.
Yes, we should actually tie in to snp_reclaim_pages() success or failure here in the case we were able to successfully unroll some or all of the firmware state change.
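One way to picture "tying in" the rollback result is a state-change helper that reports how many pages remain stuck in firmware state, so the allocation path can tell a clean rollback from a leak. The following is a userspace sketch under stated assumptions: struct mock_fw, mock_set_fw_state(), alloc_fw_buffer() and enum alloc_outcome are hypothetical, and malloc()/free() stand in for the page allocator.

```c
#include <assert.h>
#include <stdlib.h>

enum alloc_outcome { ALLOC_OK, ALLOC_FAILED_FREED, ALLOC_FAILED_LEAKED };

/* Mock firmware behaviour, chosen by the caller for illustration. */
struct mock_fw {
	int change_err;     /* error from the state change, 0 for success */
	unsigned int stuck; /* pages the rollback could not reclaim */
};

/* Reports the error and, through *stuck, how many pages were not rolled back. */
static int mock_set_fw_state(const struct mock_fw *fw, unsigned int *stuck)
{
	*stuck = fw->change_err ? fw->stuck : 0;
	return fw->change_err;
}

static void *alloc_fw_buffer(const struct mock_fw *fw, size_t size,
			     enum alloc_outcome *outcome)
{
	unsigned int stuck;
	void *buf;

	assert(fw && outcome && size > 0);
	buf = malloc(size);
	if (!buf) {
		*outcome = ALLOC_FAILED_FREED; /* nothing was allocated */
		return NULL;
	}

	if (!mock_set_fw_state(fw, &stuck)) {
		*outcome = ALLOC_OK;
		return buf;
	}

	if (stuck == 0) {
		/* Rollback was complete: the memory is safe to reuse. */
		free(buf);
		*outcome = ALLOC_FAILED_FREED;
	} else {
		/* Some pages are still firmware-owned: deliberately leak. */
		*outcome = ALLOC_FAILED_LEAKED;
	}
	return NULL;
}
```

The caller still gets NULL in both failure cases; the difference is only whether the backing memory went back to the allocator.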
quoted
+
+       return page;
+}
+
+void *snp_alloc_firmware_page(gfp_t gfp_mask)
+{
+       struct page *page;
+
+       page = __snp_alloc_firmware_pages(gfp_mask, 0, false);
+
+       return page ? page_address(page) : NULL;
+}
+EXPORT_SYMBOL_GPL(snp_alloc_firmware_page);
+
+static void __snp_free_firmware_pages(struct page *page, int order, bool locked)
+{
+       unsigned long paddr, npages = 1ul << order;
+
+       if (!page)
+               return;
+
+       paddr = __pa((unsigned long)page_address(page));
+       if (snp_set_rmp_state(paddr, npages, false, locked, true))
+               return;
quoted
Here we may be able to free some of |page|, depending on where inside snp_set_rmp_state() we failed. But again, given this is an error path already, maybe we can optimize this in a follow-up series.
Yes, we probably should be able to free some of the page(s) depending on how many page(s) got reclaimed in snp_set_rmp_state().
But these reclamation failures are probably not very common, so any failure is indicative of a bigger issue: if reclaiming a single page fails, reclaiming all the subsequent pages may well fail too. So it seems better to follow a simple recovery procedure than to handle a more complex recovery where one chunk of pages has been reclaimed and another has not.
Silently ignoring the failure is still a bad idea. At a minimum, it would
make sense to print a warning to the kernel log.
Thanks,
Ashish
BR, Jarkko
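As a rough illustration of the "warn instead of silently returning" point, here is a userspace sketch. It is not the driver code: free_firmware_pages(), log_warn() and the two mock reclaim functions are hypothetical, and fprintf(stderr, ...) stands in for a kernel log call such as pr_warn().

```c
#include <assert.h>
#include <stdio.h>

static unsigned int warnings; /* counts warnings, for demonstration only */

/* Userspace stand-in for a kernel log call such as pr_warn(). */
static void log_warn(unsigned long pfn, unsigned int npages, int err)
{
	warnings++;
	fprintf(stderr,
		"failed to reclaim %u page(s) at pfn 0x%lx (err %d), leaking them\n",
		npages, pfn, err);
}

typedef int (*reclaim_range_fn)(unsigned long pfn, unsigned int npages);

/*
 * Free path: if the pages cannot be returned to the host they are still
 * leaked, but the failure is at least reported rather than swallowed.
 */
static int free_firmware_pages(unsigned long pfn, unsigned int npages,
			       reclaim_range_fn reclaim)
{
	int ret;

	assert(reclaim);
	ret = reclaim(pfn, npages);
	if (ret) {
		log_warn(pfn, npages, ret);
		return ret;
	}
	/* Real code would hand the pages back to the page allocator here. */
	return 0;
}

/* Mock reclaim implementations for the two outcomes. */
static int reclaim_ok(unsigned long pfn, unsigned int npages)
{
	(void)pfn;
	(void)npages;
	return 0;
}

static int reclaim_fail(unsigned long pfn, unsigned int npages)
{
	(void)pfn;
	(void)npages;
	return -5;
}
```

The behaviour on failure is otherwise unchanged from the quoted patch: the pages are not freed, but the log now records which range was lost and why.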