Thread (19 messages) 19 messages, 6 authors, 2021-03-24

Re: [Patch v3 1/2] cgroup: sev: Add misc cgroup controller

From: Michal Koutný <mkoutny@suse.com>
Date: 2021-03-11 18:59:58
Also in: cgroups, kvm, lkml

Hi Vipin.

On Thu, Mar 04, 2021 at 03:19:45PM -0800, Vipin Sharma [off-list ref] wrote:
 arch/x86/kvm/svm/sev.c        |  65 +++++-
 arch/x86/kvm/svm/svm.h        |   1 +
 include/linux/cgroup_subsys.h |   4 +
 include/linux/misc_cgroup.h   | 130 +++++++++++
 init/Kconfig                  |  14 ++
 kernel/cgroup/Makefile        |   1 +
 kernel/cgroup/misc.c          | 402 ++++++++++++++++++++++++++++++++++
Given different two-fold nature (SEV caller vs misc controller) of some
remarks below, I think it makes sense to split this into two patches:
a) generic controller implementation,
b) hooking the controller into SEV ASIDs management.
+#ifndef CONFIG_KVM_AMD_SEV
+/*
+ * When this config is not defined, SEV feature is not supported and APIs in
+ * this file are not used but this file still gets compiled into the KVM AMD
+ * module.
+ *
+ * We will not have MISC_CG_RES_SEV and MISC_CG_RES_SEV_ES entries in the enum
+ * misc_res_type {} defined in linux/misc_cgroup.h.
BTW, was there any progress on conditioning sev.c build on
CONFIG_KVM_AMD_SEV? (So that the defines workaround isn't needeed.)
 static int sev_asid_new(struct kvm_sev_info *sev)
 {
-	int pos, min_asid, max_asid;
+	int pos, min_asid, max_asid, ret;
 	bool retry = true;
+	enum misc_res_type type;
+
+	type = sev->es_active ? MISC_CG_RES_SEV_ES : MISC_CG_RES_SEV;
+	sev->misc_cg = get_current_misc_cg();
+	ret = misc_cg_try_charge(type, sev->misc_cg, 1);
It may be safer to WARN_ON(sev->misc_cg) at this point (see below).

[...]
+e_uncharge:
+	misc_cg_uncharge(type, sev->misc_cg, 1);
+	put_misc_cg(sev->misc_cg);
+	return ret;
vvv
quoted hunk ↗ jump to hunk
@@ -140,6 +171,10 @@ static void sev_asid_free(int asid)
 	}
 
 	mutex_unlock(&sev_bitmap_lock);
+
+	type = sev->es_active ? MISC_CG_RES_SEV_ES : MISC_CG_RES_SEV;
+	misc_cg_uncharge(type, sev->misc_cg, 1);
+	put_misc_cg(sev->misc_cg);
It may be safer to set sev->misc_cg to NULL here.

(IIUC, with current asid_{new,free} calls it shouldn't matter but why to
rely on it in the future.)
quoted hunk ↗ jump to hunk
+++ b/kernel/cgroup/misc.c
[...]
+static void misc_cg_reduce_charge(enum misc_res_type type, struct misc_cg *cg,
+				  unsigned long amount)
misc_cg_cancel_charge seems to be a name more consistent with what we
already have in pids and memory controller.

+static ssize_t misc_cg_max_write(struct kernfs_open_file *of, char *buf,
+				 size_t nbytes, loff_t off)
+{
[...]
+
+	if (!strcmp(MAX_STR, buf)) {
+		max = ULONG_MAX;
MAX_NUM for consistency with other places.
+	} else {
+		ret = kstrtoul(buf, 0, &max);
+		if (ret)
+			return ret;
+	}
+
+	cg = css_misc(of_css(of));
+
+	if (misc_res_capacity[type])
+		cg->res[type].max = max;
In theory, parallel writers can clash here, so having the limit atomic
type to prevent this would resolve it. See also commit a713af394cf3
("cgroup: pids: use atomic64_t for pids->limit").
+static int misc_cg_current_show(struct seq_file *sf, void *v)
+{
+	int i;
+	struct misc_cg *cg = css_misc(seq_css(sf));
+
+	for (i = 0; i < MISC_CG_RES_TYPES; i++) {
+		if (misc_res_capacity[i])
Since there can be some residual charges after removing capacity (before
draining), maybe the condition along the line

if (misc_res_capacity[i] || atomic_long_read(&cg->res[i].usage))

would be more informative for debugging.
+static int misc_cg_capacity_show(struct seq_file *sf, void *v)
+{
+	int i;
+	unsigned long cap;
+
+	for (i = 0; i < MISC_CG_RES_TYPES; i++) {
+		cap = READ_ONCE(misc_res_capacity[i]);
Why is READ_ONCE only here and not in other places that (actually) check
against the set capacity value? Also, there should be a paired
WRITE_ONCCE in misc_cg_set_capacity().


Thanks,
Michal

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help