Re: [PATCH v9 41/43] virt: Add SEV-SNP guest driver
From: Borislav Petkov <bp@alien8.de>
Date: 2022-02-06 22:39:54
Also in:
kvm, linux-efi, linux-mm, lkml, platform-driver-x86
Subsystem:
documentation, the rest · Maintainers:
Jonathan Corbet, Linus Torvalds
On Fri, Jan 28, 2022 at 11:18:02AM -0600, Brijesh Singh wrote:
SEV-SNP specification provides the guest a mechanism to communicate with
^ The
the PSP without risk from a malicious hypervisor who wishes to read, alter, drop or replay the messages sent. The driver uses snp_issue_guest_request() to issue GHCB SNP_GUEST_REQUEST or SNP_EXT_GUEST_REQUEST NAE events to submit the request to PSP. The PSP requires that all communication should be encrypted using key specified through the platform_data. The userspace can use SNP_GET_REPORT ioctl() to query the guest
s/The u/U/
quoted hunk ↗ jump to hunk
attestation report. See SEV-SNP spec section Guest Messages for more details. Signed-off-by: Brijesh Singh <redacted> --- Documentation/virt/coco/sevguest.rst | 81 ++++ drivers/virt/Kconfig | 3 + drivers/virt/Makefile | 1 + drivers/virt/coco/sevguest/Kconfig | 12 + drivers/virt/coco/sevguest/Makefile | 2 + drivers/virt/coco/sevguest/sevguest.c | 605 ++++++++++++++++++++++++++ drivers/virt/coco/sevguest/sevguest.h | 98 +++++ include/uapi/linux/sev-guest.h | 50 +++ 8 files changed, 852 insertions(+) create mode 100644 Documentation/virt/coco/sevguest.rst create mode 100644 drivers/virt/coco/sevguest/Kconfig create mode 100644 drivers/virt/coco/sevguest/Makefile create mode 100644 drivers/virt/coco/sevguest/sevguest.c create mode 100644 drivers/virt/coco/sevguest/sevguest.h create mode 100644 include/uapi/linux/sev-guest.hdiff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst new file mode 100644 index 000000000000..47ef3b0821d5 --- /dev/null +++ b/Documentation/virt/coco/sevguest.rst
Adding documentation is all fine and good but you need to link that file into the TOC somewhere so that it is visible when the documentation gets generated... I guess something like this:
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index edea7fea95a8..40ad0d20032e 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst@@ -13,6 +13,7 @@ Linux Virtualization Support guest-halt-polling ne_overview acrn/index + coco/sevguest .. only:: html and subproject
And once you do, do "make htmldocs" to see whether it complains about some formatting issues or other errors etc. /me goes and does it... Ah, here we go: Documentation/virt/coco/sevguest.rst:48: WARNING: Inline emphasis start-string without end-string. Documentation/virt/coco/sevguest.rst:51: WARNING: Inline emphasis start-string without end-string. Documentation/virt/coco/sevguest.rst:55: WARNING: Inline emphasis start-string without end-string. Documentation/virt/coco/sevguest.rst:57: WARNING: Definition list ends without a blank line; unexpected unindent. There's something it doesn't like about the struct. Yeah, when I look at the html output, it is all weird and not monospaced...
quoted hunk ↗ jump to hunk
@@ -0,0 +1,81 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=================================================================== +The Definitive SEV Guest API Documentation +=================================================================== + +1. General description +====================== + +The SEV API is a set of ioctls that are used by the guest or hypervisor +to get or set certain aspect of the SEV virtual machine. The ioctls belong
^ a
+to the following classes: + + - Hypervisor ioctls: These query and set global attributes which affect the + whole SEV firmware. These ioctl are used by platform provision tools.
"provisioning" maybe?
+ + - Guest ioctls: These query and set attributes of the SEV virtual machine. + +2. API description +================== + +This section describes ioctls that can be used to query or set SEV guests.
^^^^^^^^^^^^^^^ That sounds weird.
+For each ioctl, the following information is provided along with a +description: + + Technology: + which SEV technology provides this ioctl. sev, sev-es, sev-snp or all.
^^^^^^^^^^^^^^^^^^^^ Marketing is going to scold you - those need to be in all caps. :-P
+ + Type: + hypervisor or guest. The ioctl can be used inside the guest or the + hypervisor. + + Parameters: + what parameters are accepted by the ioctl. + + Returns: + the return value. General error numbers (ENOMEM, EINVAL)
Those are negative: -ENOMEM, -EINVAL
+ are not detailed, but errors with specific meanings are. + +The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device. +The ioctl accepts struct snp_user_guest_request. The input and output structure is +specified through the req_data and resp_data field respectively. If the ioctl fails +to execute due to a firmware error, then fw_err code will be set otherwise the +fw_err will be set to 0xff.
fw_err is u64. What does 0xff mean? Everything above the least significant byte is reserved 0?
quoted hunk ↗ jump to hunk
diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig new file mode 100644 index 000000000000..07ab9ec6471c --- /dev/null +++ b/drivers/virt/coco/sevguest/Kconfig@@ -0,0 +1,12 @@ +config SEV_GUEST + tristate "AMD SEV Guest driver" + default y
Definitely not. We don't enable drivers by default unless they're ubiquitous.
+ depends on AMD_MEM_ENCRYPT && CRYPTO_AEAD2 + help + SEV-SNP firmware provides the guest a mechanism to communicate with + the PSP without risk from a malicious hypervisor who wishes to read, + alter, drop or replay the messages sent. The driver provides + userspace interface to communicate with the PSP to request the + attestation report and more. + + If you choose 'M' here, this module will be called sevguest.
...
+static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
+{
+ char zero_key[VMPCK_KEY_LEN] = {0};
+
+ if (snp_dev->vmpck)
+ return memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN) == 0;return !memcmp(...);
+
+ return true;
+}
+
+static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
+{
+ memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
+ snp_dev->vmpck = NULL;
+}
+
+static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+ u64 count;
+
+ lockdep_assert_held(&snp_cmd_mutex);
+
+ /* Read the current message sequence counter from secrets pages */
+ count = *snp_dev->os_area_msg_seqno;
+
+ return count + 1;
+}
+
+/* Return a non-zero on success */
+static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+ u64 count = __snp_get_msg_seqno(snp_dev);
+
+ /*
+ * The message sequence counter for the SNP guest request is a 64-bit
+ * value but the version 2 of GHCB specification defines a 32-bit storage
+ * for it. If the counter exceeds the 32-bit value then return zero.
+ * The caller should check the return value, but if the caller happens to
+ * not check the value and use it, then the firmware treats zero as an
+ * invalid number and will fail the message request.
+ */
+ if (count >= UINT_MAX) {
+ pr_err_ratelimited("SNP guest request message sequence counter overflow\n");How does error message help the user? Is she supposed to reboot the machine or so? Because it sounds to me like if this goes over 32-bit, this driver stops working. So what resets those sequence numbers?
+ return 0;
+ }
+
+ return count;
+}
+
+static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+ /*
+ * The counter is also incremented by the PSP, so increment it by 2
+ * and save in secrets page.
+ */
+ *snp_dev->os_area_msg_seqno += 2;
+}
+
+static inline struct snp_guest_dev *to_snp_dev(struct file *file)
+{
+ struct miscdevice *dev = file->private_data;
+
+ return container_of(dev, struct snp_guest_dev, misc);
+}
+
+static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen)
+{
+ struct snp_guest_crypto *crypto;
+
+ crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT);
+ if (!crypto)
+ return NULL;
+
+ crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+ if (IS_ERR(crypto->tfm))
+ goto e_free;
+
+ if (crypto_aead_setkey(crypto->tfm, key, keylen))
+ goto e_free_crypto;
+
+ crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
+ if (crypto->iv_len < 12) {
+ dev_err(snp_dev->dev, "IV length is less than 12.\n");And? < 12 is bad? Make that error message more user-friendly pls.
+ goto e_free_crypto;
+ }
+
+ crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT);
+ if (!crypto->iv)
+ goto e_free_crypto;
+
+ if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) {
+ if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) {
+ dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN);
+ goto e_free_crypto;
+ }
+ }
+
+ crypto->a_len = crypto_aead_authsize(crypto->tfm);
+ crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT);
+ if (!crypto->authtag)
+ goto e_free_crypto;
+
+ return crypto;
+
+e_free_crypto:
+ crypto_free_aead(crypto->tfm);
+e_free:
+ kfree(crypto->iv);
+ kfree(crypto->authtag);
+ kfree(crypto);The order of those free calls needs to be the opposite of the kmallocs above.
+ + return NULL; +}
...
+static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
+ u8 type, void *req_buf, size_t req_sz, void *resp_buf,
+ u32 resp_sz, __u64 *fw_err)
+{
+ unsigned long err;
+ u64 seqno;
+ int rc;
+
+ /* Get message sequence and verify that its a non-zero */
+ seqno = snp_get_msg_seqno(snp_dev);
+ if (!seqno)
+ return -EIO;
+
+ memset(snp_dev->response, 0, sizeof(*snp_dev->response));sizeof(struct snp_guest_msg)
+
+ /* Encrypt the userspace provided payload */
+ rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
+ if (rc)
+ return rc;
+
+ /* Call firmware to process the request */
+ rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ if (fw_err)
+ *fw_err = err;
+
+ if (rc)
+ return rc;
+
+ rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
+ if (rc) {
+ /*
+ * The verify_and_dec_payload() will fail only if the hypervisor is
+ * actively modifying the message header or corrupting the encrypted payload.
+ * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
+ * the key cannot be used for any communication. The key is disabled to ensure
+ * that AES-GCM does not use the same IV while encrypting the request payload.
+ */Put that comment over the function call.
+ dev_alert(snp_dev->dev,
+ "Detected unexpected decode failure, disabling the vmpck_id %d\n",
+ vmpck_id);
+ snp_disable_vmpck(snp_dev);
+ return rc;
+ }
+
+ /* Increment to new message sequence after payload decryption was successful. */
+ snp_inc_msg_seqno(snp_dev);
+
+ return 0;
+}
+
+static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_report_req req = {0};
+ struct snp_report_resp *resp;
+ int rc, resp_len;lockdep_assert_held(&snp_cmd_mutex);
+ if (!arg->req_data || !arg->resp_data)
+ return -EINVAL;
+
+ /* Copy the request payload from userspace */
+ if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ return -EFAULT;
+
+ /*
+ * The intermediate response buffer is used while decrypting the
+ * response payload. Make sure that it has enough space to cover the
+ * authtag.
+ */
+ resp_len = sizeof(resp->data) + crypto->a_len;
+ resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!resp)
+ return -ENOMEM;
+
+ /* Issue the command to get the attestation report */
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+ SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data,
+ resp_len, &arg->fw_err);
+ if (rc)
+ goto e_free;
+
+ /* Copy the response payload to userspace */
+ if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ rc = -EFAULT;
+
+e_free:
+ kfree(resp);
+ return rc;
+}
+
+static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+ struct snp_guest_dev *snp_dev = to_snp_dev(file);
+ void __user *argp = (void __user *)arg;
+ struct snp_guest_request_ioctl input;
+ int ret = -ENOTTY;
+
+ if (copy_from_user(&input, argp, sizeof(input)))
+ return -EFAULT;
+
+ input.fw_err = 0xff;
+
+ /* Message version must be non-zero */
+ if (!input.msg_version)
+ return -EINVAL;
+
+ mutex_lock(&snp_cmd_mutex);That mutex probably is to be held while issuing SNP commands but then you hold it here already for...
+
+ /* Check if the VMPCK is not empty */
+ if (is_vmpck_empty(snp_dev)) {... this here which is not really a SNP command issuing. Should that mutex be grabbed only around handle_guest_request() above or is it supposed to protect more stuff?
+ dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
+ mutex_unlock(&snp_cmd_mutex);
+ return -ENOTTY;
+ }
+
+ switch (ioctl) {
+ case SNP_GET_REPORT:
+ ret = get_report(snp_dev, &input);
+ break;
+ default:
+ break;
+ }
+
+ mutex_unlock(&snp_cmd_mutex);
+
+ if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
+ return -EFAULT;
+
+ return ret;
+}
+
+static void free_shared_pages(void *buf, size_t sz)
+{
+ unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (!buf)
+ return;
+
+ /* If fail to restore the encryption mask then leak it. */Useless comment - the error message says the same.
+ if (WARN_ONCE(set_memory_encrypted((unsigned long)buf, npages),
+ "Failed to restore encryption mask (leak it)\n"))
+ return;
+
+ __free_pages(virt_to_page(buf), get_order(sz));
+}
+
+static void *alloc_shared_pages(size_t sz)
+{
+ unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+ struct page *page;
+ int ret;
+
+ page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
+ if (IS_ERR(page))
+ return NULL;
+
+ ret = set_memory_decrypted((unsigned long)page_address(page), npages);
+ if (ret) {
+ pr_err("SEV-SNP: failed to mark page shared, ret=%d\n", ret);Use pr_fmt, grep the tree for an example how and drop "SEV-SNP:" - that prefix should be "sevguest:".
+ __free_pages(page, get_order(sz)); + return NULL; + } + + return page_address(page); +}
...
+static int __init snp_guest_probe(struct platform_device *pdev)
+{
+ struct snp_secrets_page_layout *layout;
+ struct snp_guest_platform_data *data;
+ struct device *dev = &pdev->dev;
+ struct snp_guest_dev *snp_dev;
+ struct miscdevice *misc;
+ int ret;
+
+ if (!dev->platform_data)
+ return -ENODEV;
+
+ data = (struct snp_guest_platform_data *)dev->platform_data;
+ layout = (__force void *)ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
+ if (!layout)
+ return -ENODEV;
+
+ ret = -ENOMEM;
+ snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
+ if (!snp_dev)
+ goto e_fail;
+
+ ret = -EINVAL;
+ snp_dev->vmpck = get_vmpck(vmpck_id, layout, &snp_dev->os_area_msg_seqno);
+ if (!snp_dev->vmpck) {
+ dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
+ goto e_fail;
+ }
+
+ /* Verify that VMPCK is not zero. */
+ if (is_vmpck_empty(snp_dev)) {
+ dev_err(dev, "vmpck id %d is null\n", vmpck_id);
+ goto e_fail;
+ }
+
+ platform_set_drvdata(pdev, snp_dev);
+ snp_dev->dev = dev;
+ snp_dev->layout = layout;
+
+ /* Allocate the shared page used for the request and response message. */
+ snp_dev->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
+ if (!snp_dev->request)
+ goto e_fail;
+
+ snp_dev->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
+ if (!snp_dev->response)
+ goto e_fail;
+
+ ret = -EIO;
+ snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
+ if (!snp_dev->crypto)
+ goto e_fail;
+
+ misc = &snp_dev->misc;
+ misc->minor = MISC_DYNAMIC_MINOR;
+ misc->name = DEVICE_NAME;
+ misc->fops = &snp_guest_fops;
+
+ /* initial the input address for guest request */
+ snp_dev->input.req_gpa = __pa(snp_dev->request);
+ snp_dev->input.resp_gpa = __pa(snp_dev->response);
+
+ ret = misc_register(misc);
+ if (ret)
+ goto e_fail;
+
+ dev_info(dev, "Initialized SEV-SNP guest driver (using vmpck_id %d)\n", vmpck_id);
+ return 0;
+
+e_fail:
+ iounmap(layout);
+ free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+ free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));The freeing needs to happen in the opposite order of the allocations. Audit all gotos.
quoted hunk ↗ jump to hunk
+ + return ret; +} + +static int __exit snp_guest_remove(struct platform_device *pdev) +{ + struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev); + + free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg)); + free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg)); + deinit_crypto(snp_dev->crypto); + misc_deregister(&snp_dev->misc); + + return 0; +} + +static struct platform_driver snp_guest_driver = { + .remove = __exit_p(snp_guest_remove), + .driver = { + .name = "snp-guest", + }, +}; + +module_platform_driver_probe(snp_guest_driver, snp_guest_probe); + +MODULE_AUTHOR("Brijesh Singh [off-list ref]"); +MODULE_LICENSE("GPL"); +MODULE_VERSION("1.0.0"); +MODULE_DESCRIPTION("AMD SNP Guest Driver");diff --git a/drivers/virt/coco/sevguest/sevguest.h b/drivers/virt/coco/sevguest/sevguest.h new file mode 100644 index 000000000000..cfa76cf8a21a --- /dev/null +++ b/drivers/virt/coco/sevguest/sevguest.h@@ -0,0 +1,98 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2021 Advanced Micro Devices, Inc. + * + * Author: Brijesh Singh <brijesh.singh@amd.com> + * + * SEV-SNP API spec is available at https://developer.amd.com/sev + */ + +#ifndef __LINUX_SEVGUEST_H_
I guess VIRT_SEVGUEST_H is better fitting.
+#define __LINUX_SEVGUEST_H_
...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette