Thread (16 messages) 16 messages, 2 authors, 2023-10-25

Re: [PATCH-cgroup 1/4] workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs from wq_unbound_cpumask

From: Tejun Heo <tj@kernel.org>
Date: 2023-10-18 09:24:36
Also in: linux-doc, linux-kselftest, lkml

Hello,

On Fri, Oct 13, 2023 at 02:11:19PM -0400, Waiman Long wrote:
When the "isolcpus" boot command line option is used to add a set
of isolated CPUs, those CPUs will be excluded automatically from
wq_unbound_cpumask to avoid running work functions from unbound
workqueues.

Recently cpuset has been extended to allow the creation of partitions
of isolated CPUs dynamically. To make it closer to the "isolcpus"
in functionality, the CPUs in those isolated cpuset partitions should be
excluded from wq_unbound_cpumask as well. This can be done currently by
explicitly writing to the workqueue's cpumask sysfs file after creating
the isolated partitions. However, this process can be error prone.
Ideally, the cpuset code should be allowed to request the workqueue code
to exclude those isolated CPUs from wq_unbound_cpumask so that this
operation can be done automatically and the isolated CPUs will be returned
back to wq_unbound_cpumask after the destructions of the isolated
cpuset partitions.

This patch adds a new workqueue_unbound_exclude_cpumask() to enable
that. This new function will exclude the specified isolated CPUs
from wq_unbound_cpumask. To be able to restore those isolated CPUs
back after the destruction of isolated cpuset partitions, a new
wq_user_unbound_cpumask is added to store the user provided unbound
cpumask either from the boot command line options or from writing to
the cpumask sysfs file. This new cpumask provides the basis for CPU
exclusion.
The behaviors around wq_unbound_cpumask is getting pretty inconsistent:

1. Housekeeping excludes isolated CPUs on boot but allows user to override
   it to include isolated CPUs afterwards.

2. If an unbound wq's cpumask doesn't have any intersection with
   wq_unbound_cpumask we ignore the per-wq cpumask and falls back to
   wq_unbound_cpumask.

3. You're adding a masking layer on top with exclude which fails to set if
   the intersection is empty.

Can we do the followings for consistency?

1. User's requested_unbound_cpumask is stored separately (as in this patch).

2. The effect wq_unbound_cpumask is determined by requested_unbound_cpumask
   & housekeeping_cpumask & cpuset_allowed_cpumask. The operation order
   matters. When an & operation yields an cpumask, the cpumask from the
   previous step is the effective one.

3. Expose these cpumasks in sysfs so that what's happening is obvious.
+int workqueue_unbound_exclude_cpumask(cpumask_var_t exclude_cpumask)
+{
+	cpumask_var_t cpumask;
+	int ret = 0;
+
+	if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
+		return -ENOMEM;
+
+	/*
+	 * The caller of this function may have called cpus_read_lock(),
+	 * use cpus_read_trylock() to avoid potential deadlock.
+	 */
+	if (!cpus_read_trylock())
+		return -EBUSY;
This means that a completely unrelated cpus_write_lock() can fail this
operation and thus cpuset config writes. Let's please not do this. Can't we
just make sure that the caller holds the lock?
+	mutex_lock(&wq_pool_mutex);
+
+	if (!cpumask_andnot(cpumask, wq_user_unbound_cpumask, exclude_cpumask))
+		ret = -EINVAL;	/* The new cpumask can't be empty */
For better or worse, the usual mode-of-failure for "no usable CPU" is just
falling back to something which works rather than failing the operation.
Let's follow that.

Thanks.

-- 
tejun
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help