Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported
From: Alex Williamson <hidden>
Date: 2016-05-06 16:55:24
Also in:
kvm, linux-iommu, linux-pci, lkml
On Fri, 6 May 2016 16:35:38 +1000 Alexey Kardashevskiy [off-list ref] wrote:
On 05/06/2016 01:05 AM, Alex Williamson wrote:
On Thu, 5 May 2016 12:15:46 +0000 "Tian, Kevin" [off-list ref] wrote:
From: Yongji Xie [mailto:xyjxie@linux.vnet.ibm.com]
Sent: Thursday, May 05, 2016 7:43 PM

Hi David and Kevin,

On 2016/5/5 17:54, David Laight wrote:
From: Tian, Kevin
Sent: 05 May 2016 10:37
...
Actually, we are not aiming to access the MSI-X table from the guest. So I think it's safe to pass through the MSI-X table if we can make sure the guest kernel would not touch the MSI-X table in the normal code path, such as a para-virtualized guest kernel on PPC64.

Then how do you prevent a malicious guest kernel from accessing it?

Or a malicious guest driver for an ethernet card setting up the receive buffer ring to contain a single word entry that contains the address associated with an MSI-X interrupt and then using a loopback mode to cause a specific packet to be received that writes the required word through that address. Remember the PCIe cycle for an interrupt is a normal memory write cycle.

David

If we have enough permission to load a malicious driver or kernel, we can easily break the guest without an exposed MSI-X table. I think it should be safe to expose the MSI-X table if we can make sure that a malicious guest driver/kernel can't use the MSI-X table to break other guests or the host. The capability of IRQ remapping could provide this kind of protection.

With IRQ remapping it doesn't mean you can pass through the MSI-X structure to the guest. I know actual IRQ remapping might be platform specific, but at least for the Intel VT-d specification, the MSI-X entry must be configured with a remappable format by the host kernel, which contains an index into the IRQ remapping table. The index will find an IRQ remapping entry which controls interrupt routing for a specific device. If you allow a malicious program to write a random index into the MSI-X entry of an assigned device, the hole is obvious...

The above might make sense only for an IRQ remapping implementation which doesn't rely on the extended MSI-X format (e.g. simply based on BDF). If that's the case for PPC, then you should build MSI-X passthrough based on this fact instead of on whether general IRQ remapping is enabled or not.

I don't think anyone is expecting that we can expose the MSI-X vector table to the guest and the guest can make direct use of it.
The end goal here is that the guest on a power system is already paravirtualized to not program the device MSI-X by directly writing to the MSI-X vector table. They have hypercalls for this since they always run virtualized. Therefore a) they never intend to touch the MSI-X vector table and b) they have sufficient isolation that a guest can only hurt itself by doing so.

On x86 we don't have a), our method of programming the MSI-X vector table is to directly write to it. Therefore we will always require QEMU to place a MemoryRegion over the vector table to intercept those accesses. However with interrupt remapping, we do have b) on x86, which means that we don't need to be so strict in disallowing user accesses to the MSI-X vector table. It's not useful for configuring MSI-X on the device, but the user should only be able to hurt themselves by writing it directly.

x86 doesn't really get anything out of this change, but it helps this special case on power pretty significantly aiui.

Thanks,

Excellent short overview, saved :)

How do we proceed with these patches? Nobody seems to be objecting to them, but nobody seems to be taking them either...
Well, this series is still based on some non-upstream patches, so... Once that dependency is resolved this series should probably be split into functional areas for acceptance by the appropriate subsystem maintainers.
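
For anyone skimming the thread later, here is a rough C sketch of the a)/b) reasoning quoted above, plus the VT-d remappable-format point Kevin raised. It is an illustration only and not code from this series: `struct platform_caps` and every function name are hypothetical, and the address-bit layout reflects my reading of the Intel VT-d specification (bits 31:20 = 0xFEE, bits 19:5 = handle[14:0], bit 4 = format, bit 2 = handle[15]).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-platform summary; not a structure from the patch series. */
struct platform_caps {
    /* (a) guest programs MSI-X via hypercalls, never by writing the table */
    bool guest_uses_hypercalls_for_msix;
    /* (b) interrupt isolation: a stray MSI write can only hurt the writer */
    bool irq_remapping_isolates_device;
};

/* Letting the user mmap the pages holding the MSI-X table needs only (b). */
static bool msix_table_mmap_allowed(const struct platform_caps *c)
{
    return c->irq_remapping_isolates_device;
}

/* The VMM must still intercept table writes unless (a) holds. */
static bool msix_table_needs_trap(const struct platform_caps *c)
{
    return !c->guest_uses_hypercalls_for_msix;
}

/* Assumed VT-d layout: does this MSI address use the remappable format? */
static bool vtd_msi_addr_is_remappable(uint32_t addr_lo)
{
    return (addr_lo >> 20) == 0xFEEu && (addr_lo & (1u << 4)) != 0;
}

/* Assumed VT-d layout: extract the 16-bit index into the remapping table. */
static uint16_t vtd_msi_addr_handle(uint32_t addr_lo)
{
    uint32_t low15 = (addr_lo >> 5) & 0x7FFFu; /* handle[14:0] */
    uint32_t bit15 = (addr_lo >> 2) & 0x1u;    /* handle[15]   */
    return (uint16_t)(low15 | (bit15 << 15));
}
```

With these definitions, a power-like platform (a and b both true) may mmap the table and needs no trap, while x86 with interrupt remapping (only b) may allow the mmap but still needs the QEMU MemoryRegion to intercept guest writes. The handle helper shows why the entry contents matter on VT-d: whatever index is written selects a remapping-table entry.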