Re: [PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
From: Jason Wang <jasowang@redhat.com>
Date: 2019-08-05 04:20:55
Also in:
kvm, linux-mm, lkml
On 2019/8/2 下午8:46, Jason Gunthorpe wrote:
On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:quoted
quoted
This must be a proper barrier, like a spinlock, mutex, or synchronize_rcu.I start with synchronize_rcu() but both you and Michael raise some concern.I've also idly wondered if calling synchronize_rcu() under the various mm locks is a deadlock situation.
Maybe, that's why I suggest to use vhost_work_flush() which is much lightweight can can achieve the same function. It can guarantee all previous work has been processed after vhost_work_flush() return.
quoted
Then I try spinlock and mutex: 1) spinlock: add lots of overhead on datapath, this leads 0 performance improvement.I think the topic here is correctness not performance improvement
But the whole series is to speed up vhost.
quoted
2) SRCU: full memory barrier requires on srcu_read_lock(), which still leads little performance improvementquoted
3) mutex: a possible issue is need to wait for the page to be swapped in (is this unacceptable ?), another issue is that we need hold vq lock during range overlap check.I have a feeling that mmu notififers cannot safely become dependent on progress of swap without causing deadlock. You probably should avoid this.
Yes, so that's why I try to synchronize the critical region by myself.
quoted
quoted
And, again, you can't re-invent a spinlock with open coding and get something better.So the question is if waiting for swap is considered to be unsuitable for MMU notifiers. If not, it would simplify codes. If not, we still need to figure out a possible solution. Btw, I come up another idea, that is to disable preemption when vhost thread need to access the memory. Then register preempt notifier and if vhost thread is preempted, we're sure no one will access the memory and can do the cleanup.I think you should use the spinlock so at least the code is obviously functionally correct and worry about designing some properly justified performance change after. Jason
Spinlock is correct but make the whole series meaningless consider it won't bring any performance improvement. Thanks