Thread (30 messages) 30 messages, 9 authors, 2017-07-12

Re: [PATCH 2/4] swait: add the missing killable swaits

From: Marcelo Tosatti <hidden>
Date: 2017-06-30 11:57:14
Also in: linux-api, linux-fsdevel, lkml

On Thu, Jun 29, 2017 at 09:03:42PM -0700, Linus Torvalds wrote:
On Thu, Jun 29, 2017 at 12:15 PM, Marcelo Tosatti [off-list ref] wrote:
quoted
On Thu, Jun 29, 2017 at 09:13:29AM -0700, Linus Torvalds wrote:
quoted
swait uses special locking and has odd semantics that are not at all
the same as the default wait queue ones. It should not be used without
very strong reasons (and honestly, the only strong enough reason seems
to be "RT").
Performance shortcut:

https://lkml.org/lkml/2016/2/25/301
Yes, I know why kvm uses it, I just don't think it's necessarily the
right thing.

That kvm commit is actually a great example: it uses swake_up() from
an interrupt, and that's in fact the *reason* it uses swake_up().

But that also fundamentally means that it cannot use swake_up_all(),
so it basically *relies* on there only ever being one single entry
that needs to be woken up.

And as far as I can tell, it really is because the queue only ever has
one entry (ie it's per-vcpu, and when the vcpu is blocked, it's
blocked - so no other user will be waiting there).
Exactly.
So it isn't that you migth queue multiple entries and then just wake
them up one at a time. There really is just one entry at a time,
right?
Yes.
And that means that swait is actuially completely the wrong thing to
do. It's more expensive and more complex than just saving the single
process pointer away and just doing "wake_up_process()".
Aha, i see.
Now, it really is entirely possible that I'm missing something, but it
does look like that to me.
Just drop it -- the optimization is not relevant anymore given VMX
hardware improvements.
We've had wake_up_process() since pretty much day #1. THAT is the
fastest and simplest direct wake-up there is, not some "simple
wait-queue".

Now, admittedly I don't know the code and really may be entirely off,
but looking at the commit (no need to go to the lkml archives - it's
commit 8577370fb0cb ("KVM: Use simple waitqueue for vcpu->wq") in
mainline), I really think the swait() use is simply not correct if
there can be multiple waiters, exactly because swake_up() only wakes
up a single entry.
There can't be: its one emulated LAPIC per vcpu. So only one vcpu
waits for that waitqueue.
So either there is only a single entry, or *all* the code like

        dvcpu->arch.wait = 0;

-       if (waitqueue_active(&dvcpu->wq))
-               wake_up_interruptible(&dvcpu->wq);
+       if (swait_active(&dvcpu->wq))
+               swake_up(&dvcpu->wq);

is simply wrong. If there are multiple blockers, and you just cleared
"arch.wait", I think they should *all* be woken up. And that's not
what swake_up() does.

So I think that kvm_vcpu_block() could easily have instead done

    vcpu->process = current;

as the "prepare_to_wait()" part, and "finish_wait()" would be to just
clear vcpu->process. No wait-queue, just a single pointer to the
single blocking thread.

(Of course, you still need serialization, so that
"wake_up_process(vcpu->process)" doesn't end up using a stale value,
but since processes are already freed with RCU because of other things
like that, the serialization is very low-cost, you only need to be
RCU-read safe when waking up).

See what I'm saying?

Note that "wake_up_process()" really is fairly widely used. It's
widely used because it's fairly obvious, and because that really *is*
the lowest-possible cost: a single pointer to the sleeping thread, and
you can often do almost no locking at all.

And unlike swake_up(), it's obvious that you only wake up a single thread.

           Linus
Feel free to drop the KVM usage... agreed the interface is a special 
case and a generic one which handles multiple waiters 
and has debugging etc should be preferred to avoid bugs

Not sure if other people are using it (swait). 
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help