Thread (20 messages), 3 authors, 2013-03-21

Re: [PATCH 2/3] VFIO: VFIO_DEVICE_SET_ADDR_MAPPING command

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2013-03-16 05:37:42
Also in: kvm

On Sat, 2013-03-16 at 09:34 +0800, Gavin Shan wrote:
> > Could you explain further how this will be used?  How the device is
> > exposed to a guest is entirely a userspace construct, so why does vfio
> > need to know or care about this?  I had assumed for AER that QEMU would
> > do the translation from host to guest address space.
>
> The weak IOCTL function (vfio_pci_arch_ioctl) was introduced by the previous
> patch. The PowerNV platform is going to override it to figure out the
> information for the EEH core to use. On the other hand, QEMU will run into
> the IOCTL command while opening (creating) one VFIO device.
>
> Though I'm not very familiar with AER, AER is quite different from
> EEH. The EEH functionality is implemented in the PHB instead of in the PCI
> device core. So we don't care about AER stuff in EEH directly :-)

To give Alex a bit more background...

EEH is our IBM-specific error handling facility, which is a superset of AER.

I.e., in addition to AER's error detection and logging, it adds a layer of
error detection at the host bridge level (such as IOMMU violations, etc.)
and a mechanism for handling and recovering from errors. This is tied to
our IOMMU domain stuff (our PEs) and our device "freezing" capability,
among others.

With VFIO + KVM, we want to implement most of the EEH support for guests in
the host kernel. There are several reasons, and we can discuss them separately
since some of them might well be debatable (mostly it's more convenient that
way: we hook into the underlying HW/FW EEH, which isn't directly accessible
from userspace, so we don't have to add a new layer of kernel -> user API on
top of the VFIO stuff). But at least one aspect drives this requirement more
strongly, which is performance:

When EEH is enabled, whenever any MMIO read returns all 1's, the kernel does
a firmware call to query the EEH state of the device and check whether it
has been frozen. On some devices, that can be a performance issue, and
going all the way to QEMU for that would be horribly expensive.
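
To make the check above concrete, here is a rough userspace sketch of the
pattern, not the actual kernel code. Everything in it is hypothetical and only
simulates the behaviour: struct sim_dev, sim_mmio_read32() and
sim_fw_query_frozen() stand in for a device, an MMIO load and the firmware
call. It assumes a frozen device returns all 1's on loads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device in this sketch (not a kernel type). */
struct sim_dev {
	uint32_t reg;       /* value the register returns when not frozen */
	bool frozen;        /* true once the device has been frozen */
	unsigned fw_calls;  /* number of simulated firmware queries so far */
};

/* Simulated MMIO load: assume a frozen device returns all 1's. */
static uint32_t sim_mmio_read32(const struct sim_dev *dev)
{
	return dev->frozen ? UINT32_MAX : dev->reg;
}

/* Simulated firmware query; this is the expensive step in real life. */
static bool sim_fw_query_frozen(struct sim_dev *dev)
{
	dev->fw_calls++;
	return dev->frozen;
}

/*
 * Checked read: only when the value is all 1's do we pay for the
 * firmware query, since all 1's can also be a legitimate register value.
 */
static uint32_t sim_checked_read32(struct sim_dev *dev, bool *is_frozen)
{
	uint32_t val = sim_mmio_read32(dev);

	*is_frozen = false;
	if (val == UINT32_MAX)
		*is_frozen = sim_fw_query_frozen(dev);
	return val;
}
```

Note that a register which legitimately reads as all 1's still triggers the
query every time, which is where the cost comes from on some devices. If each
of those queries also had to bounce through userspace, it would be far worse.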

So we want at least a way to handle that call in the kernel, and for that we
need some way of mapping things there.

Cheers,
Ben.