Thread: 135 messages, 18 authors, 2007-03-20

Re: [patch 13/26] Xen-paravirt_ops: Consistently wrap paravirt ops callsites to make them patchable

From: Jeremy Fitzhardinge <hidden>
Date: 2007-03-16 19:17:10
Also in: lkml, xen-devel

Ingo Molnar wrote:
* David Miller [off-list ref] wrote:
Perhaps the problem can be dealt with using ELF relocations.

There is another case, discussed yesterday on netdev, where run-time 
resolution of ELF relocations would be useful (for 
very-very-very-read-only variables) so if it can solve this problem 
too it would be nice to have a generic infrastructure for it.
  
yeah, and i really think this is very fundamental: [...]
I think what Dave is suggesting is that we use the reloc information the
compiler generates to find the patchable callsites rather than have
special wrappers.  This is an interesting idea.
Limited, instruction-level patching like alternatives.h is fine because 
that makes it easier to support multiple, incompatible CPU 
architectures, without having to do a hugely intrusive split at the 
kernel RPM level.

but the level of 'binary patching' done by the paravirt and Xen goes way 
beyond that,
Not really.  There are only three cases:

   1. replace an indirect call with a direct call
   2. nop out a callsite
   3. patch in a short inline sequence

And as I pointed out, this is used by all pv_op backends, using a common
piece of code to implement at least cases 1 and 2.  Case 3 could be
implemented semi-generically by using rules like "if (func == native_sti) {
patch("sti"); }", which would cover many cases where a hypervisor
doesn't need any special handling for a particular operation.
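
To make the three cases concrete, here is a minimal user-space sketch of the
rule-based selection described above. It is not the kernel's patching code:
example_nop, hyp_sti, plan_patch and struct patch_plan are hypothetical names
made up for illustration, and the functions are empty stand-ins. Only the
native_sti rule comes from the text.

```c
#include <stddef.h>

/* The three kinds of callsite patch listed above. */
enum patch_kind {
    PATCH_DIRECT_CALL,  /* case 1: indirect call -> direct call */
    PATCH_NOP,          /* case 2: nop out the callsite */
    PATCH_INLINE        /* case 3: patch in a short inline sequence */
};

struct patch_plan {
    enum patch_kind kind;
    const unsigned char *insns; /* inline bytes, set for PATCH_INLINE only */
    size_t len;                 /* number of inline bytes */
};

typedef void (*op_fn)(void);

static volatile int irq_state;

/* Stand-ins for entries in an ops table (bodies are illustrative only). */
static void native_sti(void)  { irq_state = 1; }
static void example_nop(void) { }                 /* hypothetical no-op backend */
static void hyp_sti(void)     { irq_state = 2; }  /* hypothetical hypervisor op */

static const unsigned char sti_insn[] = { 0xfb }; /* x86 encoding of "sti" */

/* Decide how a callsite should be patched, given the function it ends up calling. */
static struct patch_plan plan_patch(op_fn func)
{
    /* Default is case 1: anything without a special rule gets a direct call. */
    struct patch_plan plan = { PATCH_DIRECT_CALL, NULL, 0 };

    if (func == example_nop) {
        plan.kind = PATCH_NOP;
    } else if (func == native_sti) {
        /* The rule from the text: if (func == native_sti) patch("sti") */
        plan.kind = PATCH_INLINE;
        plan.insns = sti_insn;
        plan.len = sizeof(sti_insn);
    }
    return plan;
}
```

With these stand-ins, plan_patch(native_sti) selects the one-byte inline
sequence, plan_patch(example_nop) selects a nop-out, and a backend-specific
function such as hyp_sti falls through to the direct-call case.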

The goal is to eliminate the cost of the indirect calls by replacing
them with nice predictable direct calls.  There's a 1 byte/callsite
overhead, but I don't think that's a horrible overhead.
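
As a rough illustration of where one byte per callsite can come from: on
i386 an indirect call through an absolute address (ff 15 <abs32>) is 6 bytes,
while a direct call (e8 <rel32>) is 5, so rewriting one into the other leaves
a byte to pad. The sketch below works on a plain byte buffer, not on live
code, and assumes that this encoding difference is the overhead meant here;
encode_direct_call and its parameters are hypothetical.

```c
#include <stdint.h>

#define INDIRECT_CALL_LEN 6 /* ff 15 <abs32>: call *addr */
#define DIRECT_CALL_LEN   5 /* e8 <rel32>:    call target */

/*
 * Rewrite a 6-byte indirect-call slot that lives at address site_addr
 * into a direct call to target, padding the leftover byte with a nop.
 */
static void encode_direct_call(unsigned char slot[INDIRECT_CALL_LEN],
                               uint32_t site_addr, uint32_t target)
{
    /* rel32 is measured from the end of the 5-byte call instruction. */
    uint32_t rel = target - (site_addr + DIRECT_CALL_LEN);

    slot[0] = 0xe8;                 /* call rel32 opcode */
    slot[1] = rel & 0xff;           /* rel32, little-endian */
    slot[2] = (rel >> 8) & 0xff;
    slot[3] = (rel >> 16) & 0xff;
    slot[4] = (rel >> 24) & 0xff;
    slot[5] = 0x90;                 /* nop: the 1 byte/callsite overhead */
}
```

For a callsite at 0x1000 calling 0x2000, the displacement is
0x2000 - 0x1005 = 0xffb, so the slot becomes e8 fb 0f 00 00 90.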

And, at worst, it's only a little more complex than the kinds of
transformations alternatives.h already does.

Ideally, it's a mechanism which could be used elsewhere.  It applies when
you have some kind of ops_vector table which is updated once (or perhaps
very rarely), and you don't want to wear the overhead of indirect calls
everywhere.
 and the changes here really underscore that we:

  _should not emulate the closed source world_

There the only solution is to binary-patch - because they have no source 
code. But here, we've got all the source code.
  
I don't think this is a relevant comparison.  This is purely a matter of
optimising out unnecessary indirect calls.
nobody wants to boot a xen-paravirt kernel from a floppy, so image size 
is not an issue. In-RAM overhead would in fact be /reduced/, because 
currently all the paravirt overhead hits both the native and the 
paravirt kernel. Nor would /all/ of the vmlinuz have to be replicated in 
the images - it's enough to replicate only those functions that truly 
differ between the two build methods.
One of the explicit goals of pv_ops was to allow a single kernel to
either boot on native hardware or under any one of the supported
hypervisors, explicitly to avoid having to manage multiple kernel
images.  Compiling the kernel N+1 times for N hypervisors, and then
bundling them up in some kind of multi-image format doesn't seem like a
particularly good tradeoff.  The kernel RPM on my machine here is
already ~50Mbytes; expanding that to 250Mbytes to support native, Xen,
vmi, lguest and kvm doesn't seem reasonable.

    J