Re: [PATCH v2] s390/vfio-ap: fix memory leak in mdev remove callback
From: Halil Pasic <pasic@linux.ibm.com>
Date: 2021-05-19 12:59:53
Also in: linux-s390, lkml
On Wed, 19 May 2021 13:22:56 +0200 Christian Borntraeger [off-list ref] wrote:
> On 19.05.21 10:17, Christian Borntraeger wrote:
>> On 19.05.21 01:27, Halil Pasic wrote:
>>> On Tue, 18 May 2021 19:01:42 +0200 Christian Borntraeger [off-list ref] wrote:
>>>> On 18.05.21 17:33, Halil Pasic wrote:
>>>>> On Tue, 18 May 2021 15:59:36 +0200 Christian Borntraeger [off-list ref] wrote:
>>>>> [..]
>>>>>> Would it help, if the code in priv.c would read the hook once and then only work on the copy? We could protect that with rcu and do a synchronize rcu in vfio_ap_mdev_unset_kvm after unsetting the pointer?
>>>>>
>>>>> Unfortunately just "the hook" is ambiguous in this context. We have kvm->arch.crypto.pqap_hook that is supposed to point to a struct kvm_s390_module_hook member of struct ap_matrix_mdev which is also called pqap_hook. And struct kvm_s390_module_hook has a function pointer member named "hook".
>>>>
>>>> I was referring to the full struct.
>>>>
>>>>>>> I'll look into this.
>>>>>>
>>>>>> I think it could work. In priv.c use rcu_read_lock, save the pointer, do the check and call, call rcu_read_unlock. In vfio_ap use rcu_assign_pointer to set the pointer and after setting it to zero call synchronize_rcu.
>>>>>
>>>>> In my opinion, we should make the accesses to the kvm->arch.crypto.pqap_hook pointer properly synchronized. I'm not sure if that is what you are proposing. How do we usually do synchronisation on the stuff that lives in kvm->arch?
>>>>
>>>> RCU is a method of synchronization. We make sure that structure pqap_hook is still valid as long as we are inside the rcu read lock. So the idea is: clear pointer, wait until all old readers have finished and then proceed with getting rid of the structure.
>>>
>>> Yes I know that RCU is a method of synchronization, but I'm not very familiar with it. I'm a little confused by "read the hook once and then work on a copy". I guess, I would have to read up on the RCU again to get clarity. I intend to brush up my RCU knowledge once the patch comes along. I would be glad to have your help when reviewing an RCU based solution for this.
>>
>> Just had a quick look. It's not trivial, as the hook function itself takes a mutex and an rcu section must not sleep. Will have a deeper look.
>
> As a quick hack something like this could work. The whole locking is pretty complicated and this makes it even more complex so we might want to do a cleanup/locking rework later on.
Hm, seems our emails crossed mid air...
> index 9928f785c677..fde6e02aab54 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -609,6 +609,7 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
>   */
>  static int handle_pqap(struct kvm_vcpu *vcpu)
>  {
> +	struct kvm_s390_module_hook *pqap_hook;
>  	struct ap_queue_status status = {};
>  	unsigned long reg0;
>  	int ret;
> @@ -657,14 +658,21 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>  	 * Verify that the hook callback is registered, lock the owner
>  	 * and call the hook.
>  	 */
> -	if (vcpu->kvm->arch.crypto.pqap_hook) {
> -		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> +	rcu_read_lock();
> +	pqap_hook = rcu_dereference(vcpu->kvm->arch.crypto.pqap_hook);
> +	if (pqap_hook) {
> +		if (!try_module_get(pqap_hook->owner)) {
> +			rcu_read_unlock();
>  			return -EOPNOTSUPP;
> -		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> -		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
> +		}
Up to this point the local pqap_hook is guaranteed to point to a valid object if not NULL, ...
> +		rcu_read_unlock();
... and after this point IMHO it is not.
> +		ret = pqap_hook->hook(vcpu);
So IMHO the pointer dereference here is still problematic, but that can be fixed easily, as I described in the email I sent 3 minutes after yours. IMHO we need a local copy of vcpu->kvm->arch.crypto.pqap_hook->hook taken within the RCU read-side critical section. Do you agree?

Regards,
Halil
> +		module_put(pqap_hook->owner);
>  		if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
>  			kvm_s390_set_psw_cc(vcpu, 3);
>  		return ret;
> +	} else {
> +		rcu_read_unlock();
>  	}
>  	/*
>  	 * A vfio_driver must register a hook.
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index f90c9103dac2..a7124abd6aed 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1194,6 +1194,7 @@ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>  	mutex_lock(&matrix_dev->lock);
>  	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>  	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> +	synchronize_rcu();
>  	kvm_put_kvm(matrix_mdev->kvm);
>  	matrix_mdev->kvm = NULL;
>  	matrix_mdev->kvm_busy = false;
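
A minimal sketch of the local-copy idea described above, written as an untested single-threaded userspace model rather than kernel code. The stand-ins for rcu_read_lock(), rcu_read_unlock(), rcu_dereference(), try_module_get() and module_put() are hypothetical stubs that only track nesting depth and a reference count; call_pqap_hook(), demo_hook() and pqap_hook_ptr are illustrative names that are not part of the patch. It only shows the ordering: copy ->hook and ->owner while inside the read-side critical section, leave the section, then call through the copies so that the (possibly sleeping) hook never runs under RCU and the struct is not dereferenced after the unlock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types (assumptions). */
struct module { int refcount; };
struct kvm_vcpu { int id; };

struct kvm_s390_module_hook {
	int (*hook)(struct kvm_vcpu *vcpu);
	struct module *owner;
};

/* Stub RCU primitives: they only track nesting, no real synchronization. */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }
#define rcu_dereference(p) (p)

/* Stub module refcounting: always succeeds in this model. */
static bool try_module_get(struct module *m) { m->refcount++; return true; }
static void module_put(struct module *m)     { m->refcount--; }

/* Models kvm->arch.crypto.pqap_hook. */
static struct kvm_s390_module_hook *pqap_hook_ptr;

/* Example hook: may sleep in the kernel, so it must run outside RCU. */
static int demo_hook(struct kvm_vcpu *vcpu)
{
	(void)vcpu;
	assert(rcu_depth == 0);
	return 0;
}

static int call_pqap_hook(struct kvm_vcpu *vcpu)
{
	struct kvm_s390_module_hook *pqap_hook;
	int (*hook_fn)(struct kvm_vcpu *vcpu) = NULL;
	struct module *owner = NULL;
	int ret;

	rcu_read_lock();
	pqap_hook = rcu_dereference(pqap_hook_ptr);
	if (pqap_hook && try_module_get(pqap_hook->owner)) {
		/* Take local copies while the struct is guaranteed valid. */
		hook_fn = pqap_hook->hook;
		owner = pqap_hook->owner;
	}
	rcu_read_unlock();

	/*
	 * Simplification: the real handle_pqap() falls through to other
	 * code when no hook is registered instead of returning an error.
	 */
	if (!hook_fn)
		return -EOPNOTSUPP;

	/* From here on only the local copies are used, never pqap_hook. */
	ret = hook_fn(vcpu);
	module_put(owner);
	return ret;
}
```

Note that in this model the copied function pointer stays usable only because the module reference is held across the call; whether that is sufficient in the real driver depends on what the hook itself touches, which the model does not cover.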