Thread: 305 messages, 22 authors, 2023-01-05

Re: [PATCH Part2 v6 14/49] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled

From: "Kalra, Ashish" <ashish.kalra@amd.com>
Date: 2022-11-22 00:38:14
Also in: kvm, linux-crypto, linux-mm, lkml

Hello Boris,

On 11/20/2022 3:34 PM, Borislav Petkov wrote:
On Thu, Nov 17, 2022 at 02:56:47PM -0600, Kalra, Ashish wrote:
So we need to be able to reclaim all the pages or none.
/me goes and looks at SNP_PAGE_RECLAIM's retvals:

- INVALID_PLATFORM_STATE - platform is not in INIT state. That's
certainly not a reason to leak pages.
This should not happen, as there are sev->snp_initialized checks before
any firmware page allocation or snp page transitions.
- INVALID_ADDRESS - PAGE_PADDR is not a valid system physical address.
That's botched command buffer but not a broken page so no reason to leak
them either.

- INVALID_PAGE_STATE - the page is neither of those types: metadata,
firmware, pre-guest nor pre-swap. So if you issue page reclaim on the
wrong range of pages that looks again like a user error but no need to
leak pages.

- INVALID_PAGE_SIZE - a size mismatch. Still sounds to me like a user
error of sev-guest instead of anything wrong deeper in the FW or HW.

So in all those, if you end up supplying the wrong range of addresses,
you most certainly will end up leaking the wrong pages.

So it sounds to me like you wanna say: "Error reclaiming range, check
your driver" instead of punishing any innocent pages.
I agree, but these pages are not in the right state to be released back
to the system or accessed by the host, because they have already been
transitioned successfully to firmware state and the reclaim has failed.
If we release them back to the page allocator, then whenever the host
accesses them it will get a not-present #PF and the host process will
panic/crash.

It might be a user/sev-guest error, but these pages are now unsafe to
use. So is a kernel panic justified here, rather than withholding the
pages from the host and logging an error?
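
To make the trade-off concrete, here is a minimal userspace sketch of the
"withhold and log" policy described above. It is not the driver code: the
page states, mock_reclaim() and reclaim_range() are hypothetical names for
illustration, and the all-or-none handling is an assumption taken from the
"reclaim all the pages or none" remark earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical page states for this sketch; not the kernel's or the
 * firmware's actual definitions. */
enum page_state { PAGE_HYPERVISOR, PAGE_FIRMWARE, PAGE_LEAKED };

struct page { enum page_state state; };

/* Stand-in for issuing SNP_PAGE_RECLAIM on one page. 'fail' simulates a
 * firmware error. Returns 0 on success, -1 on failure. */
static int mock_reclaim(struct page *p, bool fail)
{
    if (fail || p->state != PAGE_FIRMWARE)
        return -1;
    p->state = PAGE_HYPERVISOR;
    return 0;
}

/* Reclaim a range of n pages. If any page fails, the whole range is
 * withheld from the allocator (marked leaked) and an error is logged.
 * Pass fail_at == n to simulate no failure.
 * Returns 0 if the caller may free the range, -1 if it must not. */
static int reclaim_range(struct page *pages, size_t n, size_t fail_at)
{
    assert(pages != NULL);

    for (size_t i = 0; i < n; i++) {
        if (mock_reclaim(&pages[i], i == fail_at) != 0) {
            fprintf(stderr,
                    "reclaim failed at page %zu; withholding %zu pages\n",
                    i, n);
            /* All or none: the range is freed as a unit, so leak it all. */
            for (size_t j = 0; j < n; j++)
                pages[j].state = PAGE_LEAKED;
            return -1;
        }
    }
    return 0;
}
```

In this sketch the caller only hands the range back to the allocator when
reclaim_range() returns 0; on -1 the pages stay out of circulation and the
host keeps running, which is the alternative to a panic.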

Thanks,
Ashish
Now, if the retval from the fw were FIRMWARE_INTERNAL_ERROR or so, then
sure, by all means. But not for the above. All the error conditions
above sound like the kernel has supplied the wrong range/botched command
buffer to the firmware so there's no need to leak pages.

Thx.