Re: [patch -rt 1/2] KVM: use simple waitqueue for vcpu->wq
From: Peter Zijlstra <peterz@infradead.org>
Date: 2015-08-07 10:58:27
Also in:
kvm, lkml
On Wed, Feb 25, 2015 at 10:02:50PM +0100, Sebastian Andrzej Siewior wrote:
quoted
+static inline int swait_active(struct swait_queue_head *q) +{ + return !list_empty(&q->task_list);In RT there was a smp_mb() which you dropped and I assume you had reasons for it.
Yeah, RT didn't have a reason for the smp_mb() -- any barrier without a comment is a bug :-) Also waitqueue_active(), its counterpart, does not have a barrier there either. Nor did I see any reason for that mb to be there.
I assumed that one can perform list_empty_careful() without a lock if the items were removed with list_del_init(). But since nothing in -RT blow up so far I guess this here is legal, too :)
Nobody will come and arrest us for software bugs -- yet ;-)
quoted
+/* + * The thing about the wake_up_state() return value; I think we can ignore it. + * + * If for some reason it would return 0, that means the previously waiting + * task is already running, so it will observe condition true (or has already). + */ +void swake_up_locked(struct swait_queue_head *q) +{ + struct swait_queue *curr; + + list_for_each_entry(curr, &q->task_list, task_list) { + wake_up_process(curr->task);okay. So since we limit everything to TASK_NORMAL which has to sleep while on the list there is no need to check if we actually woken up someone.
Partly that, also that I don't see how that return value is meaningful in the first place. If it were to return false, the task was/is already running and it will observe whatever condition we just satisfied to allow waking things up. So either way around, we'll get (at least) 1 task running.
quoted
+ list_del_init(&curr->task_list); + break; + } +} +EXPORT_SYMBOL(swake_up_locked);
quoted
+void swake_up(struct swait_queue_head *q) +{ + unsigned long flags; + + if (!swait_active(q)) + return; + + raw_spin_lock_irqsave(&q->lock, flags); + __swake_up_locked(q);I thing this should have been swake_up_locked() instead since __swake_up_locked() isn't part of this patch. Just a nitpick: later there is __prepare_to_swait() and __finish_swait() which have the __ prefix instead a _locked suffix. Not sure what is better for a better for a public API but maybe one way would be good.
Yeah, I suppose that's true ;-)
quoted
+ raw_spin_unlock_irqrestore(&q->lock, flags); +} +EXPORT_SYMBOL(swake_up); + +/* + * Does not allow usage from IRQ disabled, since we must be able to + * release IRQs to guarantee bounded hold time. + */ +void swake_up_all(struct swait_queue_head *q) +{ + struct swait_queue *curr, *next; + LIST_HEAD(tmp);WARN_ON(irqs_disabled()) ?
Lockdep should already catch that by virtue of using unconditional _irq spinlock primitives.
quoted
+ if (!swait_active(q)) + return; + + raw_spin_lock_irq(&q->lock); + list_splice_init(&q->task_list, &tmp); + while (!list_empty(&tmp)) { + curr = list_first_entry(&tmp, typeof(curr), task_list); + + wake_up_state(curr->task, state); + list_del_init(&curr->task_list);So because the task may timeout and remove itself from the list at anytime you need to hold the lock during wakeup and the removal from the list
Indeed.
quoted
+ + if (list_empty(&tmp)) + break; + + raw_spin_unlock_irq(&q->lock);and you drop the lock after each iteration in case there is an IRQ pending or the task, that has been just woken up, has a higher priority than the current task and needs to get on the CPU.
Not sure if this case matters: - _this_ task (wake_all) prio 120 - first task in queue prio 10, RR - second task in queue prio 9, RR
Why complicate things? Better to not assume anything and just do the simple correct thing.
the *old* behavior would put the second task before the first task on CPU. The *new* behaviour puts the first task on the CPU after dropping the lock. The second task (that has a higher priority but nobody knows) has to wait until the first one is done (and anything else that might been woken up in the meantime with a higher prio than 120).
Irrelevant, we _must_ drop the lock in order to maintain bounded behaviour.
quoted
+ raw_spin_lock_irq(&q->lock); + } + raw_spin_unlock_irq(&q->lock); +} +EXPORT_SYMBOL(swake_up_all);
quoted
+void __finish_swait(struct swait_queue_head *q, struct swait_queue *wait)this one has no users the __ suggests that it is locked edition. Maybe it is for the completions…
Yeah, who knows, I certainly do not anymore ;-)