Thread (46 messages) 46 messages, 4 authors, 2012-08-15

Re: [PATCH v7 2/2] kvm: KVM_EOIFD, an eventfd for EOIs

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2012-08-13 22:59:06
Also in: lkml

On Mon, Aug 13, 2012 at 04:41:05PM -0600, Alex Williamson wrote:
On Tue, 2012-08-14 at 01:06 +0300, Michael S. Tsirkin wrote:
quoted
On Mon, Aug 13, 2012 at 03:34:01PM -0600, Alex Williamson wrote:
quoted
On Sun, 2012-08-12 at 11:36 +0300, Avi Kivity wrote:
quoted
On 08/09/2012 10:26 PM, Alex Williamson wrote:
quoted
On Mon, 2012-08-06 at 13:40 +0300, Avi Kivity wrote:
quoted
On 08/06/2012 01:38 PM, Avi Kivity wrote:
quoted
Regarding the implementation, instead of a linked list, would an array
of counters parallel to the bitmap make it simpler?
Or even, replace the bitmap with an array of counters.
I'm not sure a counter array is what we're really after.  That gives us
reference counting for the irq source IDs, but not the key->gsi lookup.
You can look up the gsi while registering the eoifd, so it's accessible
as eoifd->gsi instead of eoifd->source->gsi.  The irqfd can go away
while the eoifd is still active, but is this a problem?
In my opinion, no, but Michael disagrees.
quoted
quoted
It also highlights another issue, that we have a limited set of source
IDs.  Looks like we have BITS_PER_LONG IDs, with two already used, one
for the shared userspace ID and another for the PIT.  How happy are we
going to be with a limit of 62 level interrupts in use at one time?
When we start being unhappy we can increase that number.  On the other
hand more locks and lists makes me unhappy now.
Yep, good point.  My latest version removes the source ID object lock
and list (and objects).  I still have a lock and list for the ack
notification, but it's hard not to unless we combine them into one
mega-irqfd ioctl as Michael suggests.
quoted
quoted
It's arguably a reasonable number since the most virtualization friendly
devices (sr-iov VFs) don't even support this kind of interrupt.  It's
also very wasteful allocating an entire source ID for a single GSI
within that source ID.  PCI supports interrupts A, B, C, and D, which,
in the most optimal config, each go to different GSIs.  So we could
theoretically be more efficient in our use and allocation of irq source
IDs if we tracked use by the source ID, gsi pair.
There are, in one userspace, just three gsis available for PCI links, so
you're compressing the source id space by 3.
I imagine there's a way to put each PCI interrupt pin on a GSI, but
still only 4, not a great expansion of source ID space.  I like
Michael's idea of re-using source IDs if we run out better.
quoted
quoted
That probably makes it less practical to replace anything at the top
level with a counter array.  The key that we pass back is currently the
actual source ID, but we don't specify what it is, so we could split it
and have it encode a 16bit source ID plus 16 bit GSI.  It could also be
an idr entry.
We can fix those kinds of problems by adding another layer of
indirection.  But I doubt they will be needed.  I don't see people
assigning 60 legacy devices to one guest.
Yep, we can ignore it for now and put it in the hands of userspace to
re-use IDs if needed.
quoted
quoted
Michael, would the interface be more acceptable to you if we added
separate ioctls to allocate and free some representation of an irq
source ID, gsi pair?  For instance, an ioctl might return an idr entry
for an irq source ID/gsi object which would then be passed as a
parameter in struct kvm_irqfd and struct kvm_eoifd so that the object
representing the source id/gsi isn't magically freed on it's own.  This
would also allow us to deassign/close one end and reconfigure it later.
Thanks,
Another option is to push the responsibility for allocating IDs for the
association to userspace.  Let userspace both create the irqfd and the
eoifd with the same ID, the kernel matches them at registration time and
copies the gsi/sourceid from the first to the second eventfd.
Aside from the copying gsi/sourceid bit, you've just described my latest
attempt at this series.  Specifying both a sourceid and gsi also allows
userspace to make better use of the sourceid address space (use more
than one gsi if userspace wants the complexity of managing them).
Thanks,

Alex
Turns out per device source ID is a bug copied from existing
device assignment. I am amazed we did not notice before.
There we have small # of devices so it's not a problem but there's no
reason just not to have a source ID for all irqfds.
So the problem goes away, and there is no limit on # of level irqfds,
and no need to manage IDs in userspace at all.
You can still have cookies in userspace if you like but do not map them
to source IDs.
IMHO it's not a bug, it's an implementation decision.  They could be
shared, but that doesn't make it wrong to not share them.  Given that we
have 32 memory slots, the only way you could hit this would be to have a
lot of really slow devices that don't direct-map any BARs.  A reason to
not have the same source id for everything is that I think we can do ack
notification filtering more easily using separate source ids (as is done
in the first patch of the v8 series).
Just a thought: can filtering read and clear the irqfd counter?
 As the code is today, I agree,
there's probably no advantage to using multiple source IDs.  Thanks,

Alex
I think one point worth addressing is, Gleb wanted
to get eoifd without irqfd at all and that works for
timer interrupt.

-- 
MST
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help