Thread: 19 messages, 3 authors, 2014-09-24

Re: [3/5] pseries: Create device hotplug entry point

From: Nathan Fontenot <hidden>
Date: 2014-09-23 14:43:21

On 09/22/2014 08:15 PM, Tyrel Datwyler wrote:
On 09/17/2014 12:15 PM, Nathan Fontenot wrote:
On 09/17/2014 02:07 AM, Michael Ellerman wrote:
On Mon, 2014-09-15 at 15:31 -0500, Nathan Fontenot wrote:
For pseries system the kernel will be notified of hotplug requests in
the form of rtas hotplug events. 
Can you flesh that design out a bit for me, I don't entirely get how it's going
to work.

The kernel gets the rtas hotplug events (in rtasd.c) and spits them out to
userspace, which then writes them back in ?
This patch creates a common routine that can handle these requests in both
the PowerVM anbd PowerKVM environments, handle_dlpar_errorlog(). This also
                ^
creates the initial memory hotplug request handling stub.

For PowerVM this patch also creates a new /proc file that the drmgr
command will use to write rtas hotplug events to.
Why is this different between phyp and KVM?
For future PowerKVM handling the rtas check-exception code can pass
any rtas hotplug events received to handle_dlpar_errorlog().
Internally to the kernel you mean?
Perhaps a better explanation of how things work today and where I see
them going is needed. I was trying to avoid a long explanation and I
don't think my shortened explanation worked. I'll include this in v2
of the patchset too.

The current hotplug (or dlpar) of devices (the process is generally the
same for memory, cpu, and pci) on PowerVM systems is initiated
from the HMC, which communicates the request to the partitions through
the RSCT framework. The RSCT framework then invokes the drmgr command.
The drmgr command performs the hotplug operation by doing some pieces,
such as most of the rtas calls and device tree parsing, in userspace
and making requests to the kernel to online/offline the device, update the
device tree and add/remove the device.

For PowerKVM the approach is to follow what is currently being done for
pci hotplug. A hotplug request is initiated from the host. QEMU then
sends an EPOW interrupt to the guest which causes the guest to make the
rtas,check-exception call. In QEMU, the rtas,check-exception call
returns a rtas hotplug event to the guest. I was using this same framework
to also enable memory (and next cpu) hotplug.

You are correct that the current pci hotplug path for PowerKVM involves
the kernel receiving the rtas event, passing it to rtas_errd in userspace,
and having rtas_errd invoke drmgr. The drmgr command then handles the request
as described above for PowerVM systems.

There is no need for this circuitous route; we should just handle the entire
hotplug of devices in the kernel. What I am hoping to do is to enable this
by moving the code to handle hotplug from drmgr into the kernel and 
provide a single path for handling hotplug for PowerVM and PowerKVM. To
make this work for PowerKVM we will update the kernel rtas code to
recognize rtas hotplug events returned from rtas,check-exception calls
and call handle_dlpar_errorlog(). The hotplug rtas event is never sent out
to userspace.
Wouldn't we still want the event surfaced to userspace so that it can at
least be logged?
The only logging of hotplug/dlpar events we do is putting a notification
in /var/log/messages. This is done today by the drmgr command.

I can add a pr_info message to log the hotplug/dlpar request and its
success/failure.

Also, I believe one of the longer term goals is to not require the rtas_errd
daemon for PowerKVM.
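
To illustrate the single handling path and the logging mentioned above,
here is a rough userspace model, not kernel code. Only the idea of a
common handle_dlpar_errorlog() comes from this thread; the struct layout,
the enum values, the model_* names, and the log format are all
hypothetical and do not reflect the real rtas hotplug event format.

```c
/* Userspace model only. Every type and name below is a hypothetical
 * stand-in; the real rtas hotplug event layout is not shown here. */
#include <assert.h>
#include <stdio.h>

enum hp_resource { HP_RES_MEM, HP_RES_CPU, HP_RES_PCI };
enum hp_action { HP_ACT_ADD, HP_ACT_REMOVE };

struct hp_request {
	enum hp_resource resource;
	enum hp_action action;
	unsigned int drc_index;	/* hypothetical device identifier */
};

/* Stands in for the kernel log; pr_info() would be used in the kernel. */
static char last_log[128];

/* Stub, mirroring the "initial memory hotplug request handling stub". */
static int model_dlpar_memory(const struct hp_request *req)
{
	(void)req;
	return 0;
}

/* Common routine: dispatch on resource type, then log the outcome. */
int model_handle_dlpar_errorlog(const struct hp_request *req)
{
	int rc;

	assert(req != NULL);

	switch (req->resource) {
	case HP_RES_MEM:
		rc = model_dlpar_memory(req);
		break;
	default:
		rc = -1;	/* cpu and pci are not handled in this model */
		break;
	}

	snprintf(last_log, sizeof(last_log),
		 "dlpar: %s request, resource %d, drc 0x%x: %s",
		 req->action == HP_ACT_ADD ? "add" : "remove",
		 (int)req->resource, req->drc_index,
		 rc ? "failed" : "succeeded");
	return rc;
}

/* PowerVM-style entry: an event written in by drmgr via the /proc file. */
int model_proc_write(const struct hp_request *req)
{
	return model_handle_dlpar_errorlog(req);
}

/* PowerKVM-style entry: an event returned by a check-exception call. */
int model_check_exception(const struct hp_request *req)
{
	return model_handle_dlpar_errorlog(req);
}
```

Both entry points are deliberately thin wrappers: the point of the model
is that the request never leaves the common handler, and the success or
failure message is produced in one place regardless of how the event
arrived.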

-Nathan