Thread: 40 messages, 6 authors, 2022-01-20

Re: [RFC][PATCH 3/3] sched: User Mode Concurrency Groups

From: Peter Oskolkov <hidden>
Date: 2022-01-19 17:33:32
Also in: linux-mm, lkml

On Wed, Jan 19, 2022 at 12:47 AM Peter Zijlstra [off-list ref] wrote:
> On Tue, Jan 18, 2022 at 10:19:21AM -0800, Peter Oskolkov wrote:
>> ============= worker-to-worker context switches
>>
>> One example: absl::Mutex (https://abseil.io/about/design/mutex) has
>> Google-internal extensions that are "fiber aware". More specifically,
>> consider this situation:
>>
>> - worker W1 acquired the mutex and is doing its work
>> - worker W2 calls mutex::lock()
>>   mutex::lock(), being aware of workers, understands that W2 is going
>>   to sleep; so instead of just doing so, waking the server, and letting
>>   the server figure out what to run in place of the sleeping worker,
>>   mutex::lock() calls into the userspace scheduler in the context of W2
>>   running, and the userspace scheduler then picks W3 to run and does
>>   a W2->W3 context switch.
>>
>> The optimization above replaces W2->Server and Server->W3 context switches
>> with a single W2->W3 context switch, which is a material performance gain.
> Yes, I've also already reconsidered. Things like pipelines and other
> fixed order scheduling policies will greatly benefit from
> worker-to-worker switching.
>
> But I think all of them are explicit. That is, we can limit the
> ::next_tid usage to sys_umcg_wait() and never look at it for implicit
> blocks.
Yes, of course - when a worker blocks, its server gets notified.
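
To make the switch counts concrete, here is a toy model of the two paths
discussed above. It is only a sketch: `Scheduler`, `block_via_server`, and
`block_direct` are hypothetical names for illustration, not part of UMCG or
absl::Mutex, and the model records switches rather than performing them.

```python
from collections import deque

SERVER = "server"  # stand-in for the UMCG server thread (assumption)


class Scheduler:
    """Toy userspace scheduler that only records context switches."""

    def __init__(self, runnable):
        self.runnable = deque(runnable)  # workers ready to run
        self.switches = []               # (from, to) pairs, in order

    def _switch(self, src, dst):
        self.switches.append((src, dst))
        return dst

    def block_via_server(self, worker):
        # Baseline: the blocking worker wakes its server, and the server
        # then picks the next worker. Two switches when a worker is ready.
        self._switch(worker, SERVER)
        if self.runnable:
            return self._switch(SERVER, self.runnable.popleft())
        return SERVER

    def block_direct(self, worker):
        # Optimization: pick the next worker in the blocking worker's own
        # context and switch to it directly. One switch.
        if self.runnable:
            return self._switch(worker, self.runnable.popleft())
        # Nothing else to run: fall back to the server.
        return self._switch(worker, SERVER)
```

With W3 runnable, `block_via_server("W2")` records W2->server and
server->W3, while `block_direct("W2")` records only W2->W3.
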
>> In addition, when W1 calls mutex::unlock(), the scheduling code determines
>> that W2 is waiting on the mutex, and thus calls W2::wake() from the context of
>> running W1 (you asked earlier why do we need "WAKE_ONLY").
> This I'm not at all convinced on. That sounds like it will violate the
> 1:1 thing.
wake_only is a wakeup event, meaning the worker gets added to the wake
queue, not scheduled on a CPU; we don't have to implement it in the
kernel, though - the userspace may keep its own wake queue for workers
like this. So feel free to ignore this operation.
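
A rough sketch of the userspace-only alternative, where unlock() puts the
waiter on a wake queue instead of running it. `FiberMutex` and its methods
are hypothetical names, and the model assumes a woken worker simply retries
lock() once the userspace scheduler runs it.

```python
from collections import deque


class FiberMutex:
    """Toy worker-aware mutex; waking only enqueues, it never runs a worker."""

    def __init__(self, wake_queue):
        self.owner = None
        self.waiters = deque()        # workers blocked on this mutex
        self.wake_queue = wake_queue  # owned by the userspace scheduler

    def lock(self, worker):
        if self.owner is None:
            self.owner = worker
            return True
        # Contended: the caller would now switch away (e.g. W2->W3).
        self.waiters.append(worker)
        return False

    def unlock(self, worker):
        if self.owner != worker:
            raise RuntimeError("unlock by non-owner")
        self.owner = None
        if self.waiters:
            # "Wake only": mark the first waiter runnable, keep running
            # the current worker.
            self.wake_queue.append(self.waiters.popleft())
```

Here W1 calling unlock() leaves W2 on the wake queue while W1 keeps
running; the scheduler decides later when W2 actually gets a CPU.
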