Re: [Patch v3 1/2] cgroup: sev: Add misc cgroup controller
From: Michal Koutný <mkoutny@suse.com>
Date: 2021-03-11 18:59:58
Also in:
cgroups, kvm, lkml
Hi Vipin.

On Thu, Mar 04, 2021 at 03:19:45PM -0800, Vipin Sharma [off-list ref] wrote:
 arch/x86/kvm/svm/sev.c        |  65 +++++-
 arch/x86/kvm/svm/svm.h        |   1 +
 include/linux/cgroup_subsys.h |   4 +
 include/linux/misc_cgroup.h   | 130 +++++++++++
 init/Kconfig                  |  14 ++
 kernel/cgroup/Makefile        |   1 +
 kernel/cgroup/misc.c          | 402 ++++++++++++++++++++++++++++++++++

Given the two-fold nature (SEV caller vs misc controller) of some remarks below, I think it makes sense to split this into two patches: a) generic controller implementation, b) hooking the controller into SEV ASID management.

+#ifndef CONFIG_KVM_AMD_SEV
+/*
+ * When this config is not defined, SEV feature is not supported and APIs in
+ * this file are not used but this file still gets compiled into the KVM AMD
+ * module.
+ *
+ * We will not have MISC_CG_RES_SEV and MISC_CG_RES_SEV_ES entries in the enum
+ * misc_res_type {} defined in linux/misc_cgroup.h.

BTW, was there any progress on conditioning the sev.c build on CONFIG_KVM_AMD_SEV? (So that the defines workaround isn't needed.)

static int sev_asid_new(struct kvm_sev_info *sev)
{
- int pos, min_asid, max_asid;
+ int pos, min_asid, max_asid, ret;
bool retry = true;
+ enum misc_res_type type;
+
+ type = sev->es_active ? MISC_CG_RES_SEV_ES : MISC_CG_RES_SEV;
+ sev->misc_cg = get_current_misc_cg();
+ ret = misc_cg_try_charge(type, sev->misc_cg, 1);

It may be safer to WARN_ON(sev->misc_cg) at this point (see below).

[...]
+e_uncharge:
+ misc_cg_uncharge(type, sev->misc_cg, 1);
+ put_misc_cg(sev->misc_cg);
+ return ret;
vvv
@@ -140,6 +171,10 @@ static void sev_asid_free(int asid)
 }

 mutex_unlock(&sev_bitmap_lock);
+
+ type = sev->es_active ? MISC_CG_RES_SEV_ES : MISC_CG_RES_SEV;
+ misc_cg_uncharge(type, sev->misc_cg, 1);
+ put_misc_cg(sev->misc_cg);

It may be safer to set sev->misc_cg to NULL here.
(IIUC, with the current asid_{new,free} calls it shouldn't matter, but why
rely on that in the future?)
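
To make the two suggestions concrete, here is a minimal userspace sketch of the pairing, not the kernel code. The struct layouts, the warn_on() helper and the asid_charge()/asid_uncharge() names are hypothetical stand-ins invented for illustration; the real WARN_ON() only reports, whereas this sketch also refuses the charge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct misc_cg { long usage; long max; int refs; };
struct sev_info { struct misc_cg *misc_cg; };

/* Userspace approximation of WARN_ON(): report and return the condition. */
static int warn_on(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/* Take a reference and charge one unit; 0 on success, -1 on failure. */
static int asid_charge(struct sev_info *sev, struct misc_cg *cg)
{
	/* A non-NULL pointer here means an earlier charge was never released. */
	if (warn_on(sev->misc_cg != NULL, "stale misc_cg pointer"))
		return -1;

	if (cg->usage + 1 > cg->max)
		return -1;

	cg->refs++;
	cg->usage++;
	sev->misc_cg = cg;
	return 0;
}

/* Drop the charge and the reference, then clear the pointer. */
static void asid_uncharge(struct sev_info *sev)
{
	struct misc_cg *cg = sev->misc_cg;

	if (!cg)
		return;		/* nothing charged, or already released */

	assert(cg->usage > 0);
	cg->usage--;
	cg->refs--;
	sev->misc_cg = NULL;	/* a repeated call now becomes a no-op */
}
```

With the pointer cleared on the free side, a second asid_uncharge() cannot uncharge twice, and the check on the charge side catches a missing free.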
+++ b/kernel/cgroup/misc.c
[...]
+static void misc_cg_reduce_charge(enum misc_res_type type, struct misc_cg *cg,
+ unsigned long amount)

misc_cg_cancel_charge seems to be a name more consistent with what we already have in pids and memory controller.
+static ssize_t misc_cg_max_write(struct kernfs_open_file *of, char *buf,
+ size_t nbytes, loff_t off)
+{
[...]
+
+ if (!strcmp(MAX_STR, buf)) {
+ max = ULONG_MAX;

MAX_NUM for consistency with other places.

+ } else {
+ ret = kstrtoul(buf, 0, &max);
+ if (ret)
+ return ret;
+ }
+
+ cg = css_misc(of_css(of));
+
+ if (misc_res_capacity[type])
+ cg->res[type].max = max;

In theory, parallel writers can clash here, so making the limit an atomic
type would prevent this. See also commit a713af394cf3
("cgroup: pids: use atomic64_t for pids->limit").
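
As a rough illustration of the idea (a userspace sketch using C11 atomics, not the kernel's atomic_long_t API), the limit can be published with a single atomic store and read with a single atomic load on the charge path. The struct and the misc_res_set_max()/misc_res_try_charge() names are hypothetical; MAX_STR and MAX_NUM mirror the names used in the patch, and the sketch assumes the input buffer has already been stripped of whitespace.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STR "max"
#define MAX_NUM ULONG_MAX

/* Hypothetical stand-in for one resource of a misc cgroup. */
struct misc_res {
	atomic_ulong max;
	atomic_ulong usage;
};

/* Parse "max" or a number and publish it with one atomic store. */
static int misc_res_set_max(struct misc_res *res, const char *buf)
{
	unsigned long max;
	char *end;

	assert(res && buf);

	if (!strcmp(buf, MAX_STR)) {
		max = MAX_NUM;
	} else {
		errno = 0;
		max = strtoul(buf, &end, 0);
		if (errno || end == buf || *end != '\0')
			return -EINVAL;
	}

	atomic_store(&res->max, max);
	return 0;
}

/* Charge `amount` units; roll back and fail if the limit is exceeded. */
static int misc_res_try_charge(struct misc_res *res, unsigned long amount)
{
	unsigned long new_usage = atomic_fetch_add(&res->usage, amount) + amount;

	if (new_usage > atomic_load(&res->max)) {
		atomic_fetch_sub(&res->usage, amount);
		return -EBUSY;
	}
	return 0;
}
```

Readers then see either the old or the new limit, never a partially written value, regardless of how many writers race.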
+static int misc_cg_current_show(struct seq_file *sf, void *v)
+{
+ int i;
+ struct misc_cg *cg = css_misc(seq_css(sf));
+
+ for (i = 0; i < MISC_CG_RES_TYPES; i++) {
+ if (misc_res_capacity[i])

Since there can be some residual charges after removing capacity (before draining), maybe a condition along the lines of

  if (misc_res_capacity[i] || atomic_long_read(&cg->res[i].usage))

would be more informative for debugging.
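
A small userspace sketch of the effect, with plain arrays standing in for misc_res_capacity[] and the per-cgroup usage counters. The enum, the resource names and the current_show() helper are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical resource list; the patch defines its own enum and names. */
enum { RES_SEV, RES_SEV_ES, RES_TYPES };

static const char *const res_name[RES_TYPES] = { "sev", "sev_es" };

/* Print "<name> <usage>" per visible resource; returns lines written. */
static int current_show(FILE *out, const unsigned long capacity[RES_TYPES],
			const unsigned long usage[RES_TYPES])
{
	int i, lines = 0;

	assert(out != NULL);

	for (i = 0; i < RES_TYPES; i++) {
		/* Show a resource if it has capacity OR still carries charges. */
		if (capacity[i] || usage[i]) {
			fprintf(out, "%s %lu\n", res_name[i], usage[i]);
			lines++;
		}
	}
	return lines;
}
```

With capacity already set to 0 but three units still charged, the entry stays visible until the usage drains, which is the case that would otherwise be hidden.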
+static int misc_cg_capacity_show(struct seq_file *sf, void *v)
+{
+ int i;
+ unsigned long cap;
+
+ for (i = 0; i < MISC_CG_RES_TYPES; i++) {
+ cap = READ_ONCE(misc_res_capacity[i]);

Why is READ_ONCE only here and not in other places that (actually) check against the set capacity value? Also, there should be a paired WRITE_ONCE in misc_cg_set_capacity().

Thanks,
Michal
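
For reference, the READ_ONCE()/WRITE_ONCE() pairing asked about above can be sketched in userspace as follows. The two macros are simplified approximations built on the GCC/Clang __typeof__ extension (the kernel's versions do more), and the array and function names are hypothetical, not the ones from the patch.

```c
#include <assert.h>

/*
 * Simplified userspace approximations of the kernel macros; they force a
 * single volatile access but provide no ordering guarantees.
 */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

enum { RES_TYPES = 2 };

static unsigned long res_capacity[RES_TYPES];

/* Writer side: every update of the shared value goes through WRITE_ONCE. */
static int set_capacity(int type, unsigned long capacity)
{
	if (type < 0 || type >= RES_TYPES)
		return -1;
	WRITE_ONCE(res_capacity[type], capacity);
	return 0;
}

/* Reader side: snapshot the value once, then use only the local copy. */
static int has_capacity_for(int type, unsigned long new_usage)
{
	unsigned long cap;

	assert(type >= 0 && type < RES_TYPES);
	cap = READ_ONCE(res_capacity[type]);
	return cap && new_usage <= cap;
}
```

The point of the pairing is that every lockless reader that compares against the capacity takes one snapshot, and the writer updates it with one store.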