Re: Question on guest enable msi fail when using GICv4/4.1
From: Jason Wang <jasowang@redhat.com>
Date: 2021-05-08 01:54:37
Also in: kvm, kvmarm, linux-pci
On 2021/5/8 1:36 AM, Marc Zyngier wrote:
> On Fri, 07 May 2021 12:02:57 +0100, Marc Zyngier [off-list ref] wrote:
>> On Fri, 07 May 2021 10:58:23 +0100, Shaokun Zhang [off-list ref] wrote:
>>> Hi Marc,
>>>
>>> Thanks for your quick reply.
>>>
>>> On 2021/5/7 17:03, Marc Zyngier wrote:
>>>> On Fri, 07 May 2021 06:57:04 +0100, Shaokun Zhang [off-list ref] wrote:
>>>>> [This letter comes from Nianyao Tang]
>>>>>
>>>>> Hi,
>>>>>
>>>>> Using GICv4/4.1 and msi capability, guest vf driver requires 3 vectors and enable msi, will lead to guest stuck.
>>>>
>>>> Stuck how?
>>>
>>> Guest serial does not response anymore and guest network shutdown.
>>>>> Qemu gets number of interrupts from Multiple Message Capable field set by guest. This field is aligned to a power of 2 (if a function requires 3 vectors, it initializes it to 2).
>>>>
>>>> So I guess this is a MultiMSI device with 4 vectors, right?
>>>
>>> Yes, it can support maximum of 32 msi interrupts, and vf driver only use 3 msi.
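For readers following along, the rounding described above can be sketched in C. This is only an illustration of the log2 encoding used by the MSI Multiple Message fields (a field value of 2 stands for 4 vectors); the helper names msi_encode_vectors and msi_decode_vectors are made up for this sketch and are not QEMU or kernel functions.

```c
#include <assert.h>

/*
 * Hypothetical helper: encode a requested vector count as the log2
 * exponent stored in the MSI Multiple Message fields, i.e. the
 * smallest e such that (1 << e) >= nvec.
 */
static unsigned int msi_encode_vectors(unsigned int nvec)
{
    unsigned int e = 0;

    assert(nvec >= 1 && nvec <= 32);   /* MSI allows at most 32 vectors */
    while ((1u << e) < nvec)
        e++;
    return e;
}

/* Hypothetical helper: number of vectors implied by an encoded field. */
static unsigned int msi_decode_vectors(unsigned int field)
{
    return 1u << field;
}
```

With this encoding a driver that wants 3 vectors ends up with a field value of 2, which anyone reading the field back decodes as 4 vectors. That is where the fourth, unmapped vector in this thread comes from.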
>>>>> However, guest driver just sends 3 mapi-cmd to vits and 3 ite entries is recorded in host. Vfio initializes msi interrupts using the number of interrupts 4 provide by qemu. When it comes to the 4th msi without ite in vits, in irq_bypass_register_producer, producer and consumer will __connect fail, due to find_ite fail, and do not resume guest.
>>>>
>>>> Let me rephrase this to check that I understand it:
>>>>
>>>> - The device has 4 vectors
>>>> - The guest only create mappings for 3 of them
>>>> - VFIO calls kvm_vgic_v4_set_forwarding() for each vector
>>>> - KVM doesn't have a mapping for the 4th vector and returns an error
>>>> - VFIO disable this 4th vector
>>>>
>>>> Is that correct? If yes, I don't understand why that impacts the guest at all. From what I can see, vfio_msi_set_vector_signal() just prints a message on the console and carries on.
>>>
>>> function calls:
>>> --> vfio_msi_set_vector_signal
>>> --> irq_bypass_register_producer
>>> --> __connect
>>>
>>> in __connect, add_producer finally calls kvm_vgic_v4_set_forwarding and fails to get the 4th mapping. When add_producer fails, it does not call cons->start, which would call kvm_arch_irq_bypass_start and then kvm_arm_resume_guest.
>>
>> [+Eric, who wrote the irq_bypass infrastructure.]
>>
>> Ah, so the guest is actually paused, not in a livelock situation (which is how I interpreted "stuck").
>>
>> I think we should handle this case gracefully, as there should be no expectation that the guest will be using this interrupt. Given that VFIO seems to be pretty unfazed when a producer fails, I'm tempted to do the same thing and restart the guest.
>>
>> Also, __disconnect doesn't care about errors, so why should __connect have this odd behaviour?
>>
>> Can you please try this? It is completely untested (and I think the del_consumer call is odd, which is why I've also dropped it).
>>
>> Eric, what do you think?
>
> Adding Zhu, Jason, MST to the party.
>
> It all seems to be caused by this commit:
>
> commit a979a6aa009f3c99689432e0cdb5402a4463fb88
> Author: Zhu Lingshan [off-list ref]
> Date:   Fri Jul 31 14:55:33 2020 +0800
>
>     irqbypass: do not start cons/prod when failed connect
>
>     If failed to connect, there is no need to start consumer nor
>     producer.
>
>     Signed-off-by: Zhu Lingshan [off-list ref]
>     Suggested-by: Jason Wang [off-list ref]
>     Link: https://lore.kernel.org/r/20200731065533.4144-7-lingshan.zhu@intel.com
>     Signed-off-by: Michael S. Tsirkin [off-list ref]
>
> Zhu, I'd really like to understand why you think it is OK not to restart consumer and producers when a connection has failed to be established between the two?
My bad, I didn't check ARM code but it's not easy to infer that the cons->start/stop is not a per consumer specific operation but a global one like VM halting/resuming.
> In the case of KVM/arm64, this results in the guest being forever suspended and never resumed. That's obviously not an acceptable regression, as there is a number of benign reasons for a connect to fail.
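To make the failure mode concrete, here is a small toy model in C of the sequence discussed in this thread. It is not the kernel's irqbypass code: every toy_* name and both constants are invented for illustration, and the only behaviour it assumes is what the thread states, namely that the consumer stop/start hooks pause and resume the whole guest on arm64, and that adding the producer fails for a vector without an ITE.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_VECTORS 4   /* vectors VFIO sets up, per the thread */
#define TOY_NR_ITES    3   /* mappings the guest actually created */

/* Toy guest state: on arm64 stop/start act on the whole VM. */
struct toy_vm {
    bool paused;
};

static void toy_cons_stop(struct toy_vm *vm)  { vm->paused = true; }
static void toy_cons_start(struct toy_vm *vm) { vm->paused = false; }

/* Stand-in for add_producer: fails when the vector has no ITE. */
static int toy_add_producer(int vector)
{
    return vector < TOY_NR_ITES ? 0 : -1;
}

/*
 * Stand-in for __connect. With start_on_failure == false it models the
 * behaviour described for commit a979a6aa009f (no start after a failed
 * connect); with true it models restarting unconditionally.
 */
static int toy_connect(struct toy_vm *vm, int vector, bool start_on_failure)
{
    int ret;

    toy_cons_stop(vm);
    ret = toy_add_producer(vector);
    if (ret == 0 || start_on_failure)
        toy_cons_start(vm);
    return ret;
}

/* Register every vector in turn; report whether the guest stays paused. */
static bool toy_guest_left_paused(bool start_on_failure)
{
    struct toy_vm vm = { .paused = false };

    for (int v = 0; v < TOY_NR_VECTORS; v++)
        (void)toy_connect(&vm, v, start_on_failure); /* caller only logs errors */
    return vm.paused;
}
```

In this model toy_guest_left_paused(false) reports a paused guest after the fourth vector fails, while toy_guest_left_paused(true) does not, which is the difference a revert is meant to restore.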
Let's revert this commit. Thanks
> Thanks,
>
> M.
_______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel