Re: [PATCH] powerpc/pseries/mobility: ignore ibm,platform-facilities updates
From: Nathan Lynch <hidden>
Date: 2021-10-20 15:55:53
Tyrel Datwyler [off-list ref] writes:
On 10/19/21 2:36 PM, Nathan Lynch wrote:quoted
Tyrel Datwyler [off-list ref] writes:quoted
On 10/18/21 9:34 AM, Nathan Lynch wrote:quoted
On VMs with NX encryption, compression, and/or RNG offload, these capabilities are described by nodes in the ibm,platform-facilities device tree hierarchy: $ tree -d /sys/firmware/devicetree/base/ibm,platform-facilities/ /sys/firmware/devicetree/base/ibm,platform-facilities/ ├── ibm,compression-v1 ├── ibm,random-v1 └── ibm,sym-encryption-v1 3 directories The acceleration functions that these nodes describe are not disrupted by live migration, not even temporarily. But the post-migration ibm,update-nodes sequence firmware always sends "delete" messages for this hierarchy, followed by an "add" directive to reconstruct it via ibm,configure-connector (log with debugging statements enabled in mobility.c): mobility: removing node /ibm,platform-facilities/ibm,random-v1:4294967285 mobility: removing node /ibm,platform-facilities/ibm,compression-v1:4294967284 mobility: removing node /ibm,platform-facilities/ibm,sym-encryption-v1:4294967283 mobility: removing node /ibm,platform-facilities:4294967286 ... mobility: added node /ibm,platform-facilities:4294967286It always bothered me that the update from firmware here was an delete/add at the entire '/ibm,platform-facilities' node level instead of update properties calls. When I asked about it years ago the claim was it was just easier to pass an entire sub-tree as a single node add.Yes, I believe this is for ease of implementation on their part.I'd still argue that a full node removal or addition is a platform reconfiguration on par with a hotplug operation.
Well... I think we might be better served by a slightly more flexible view of things :-) These are just a series of messages from firmware that should be considered as a whole, and they don't dictate exactly what the OS must do to its own data structures. Nothing mandates that the OS immediately and literally apply the changes as they are individually reported by the update-nodes sequence. Anyway, whether the firmware behavior is reasonable is sort of beside the point since we're stuck with it.
quoted hunk ↗ jump to hunk
quoted
quoted
quoted
This is because add_dt_node() -> dlpar_attach_node() attaches only the parent node returned from configure-connector, ignoring any children. This should be corrected for the general case, but fixing that won't help with the stale OF node references, which is the more urgent problem.I don't follow. If the code path you mention is truly broken in the way you say then DLPAR operations involving nodes with child nodes should also be broken.Hmm I don't know of any kernel-based DLPAR workflow that attaches sub-trees i.e. parent + children. Processor addition attaches CPU nodes and possibly cache nodes, but they have sibling relationships. If you're thinking of I/O adapters, the DT updates are (still!) driven from user space. As undesirable as that is, that use case actually works correctly AFAIK. What happens for the platfac LPM scenario is that dlpar_configure_connector() returns the node pointer for the root ibm,platform-facilities node, with children attached. That node pointer is passed from add_dt_node() -> dlpar_attach_node() -> of_attach_node() -> __of_attach_node(), where its child and sibling fields are overwritten in the process of attaching it to the tree.Well it worked back in 2013 when I got all the in kernel device tree update stuff working. Looks like I missed this one which now explains one area where platform-facilities update originally went to shit. commit 6162dbe49a451f96431a23b4821f05e3bd925bc1 Author: Grant Likely [off-list ref] Date: Wed Jul 16 08:48:46 2014 -0600 of: Make sure attached nodes don't carry along extra children The child pointer does not get cleared when attaching new nodes which could cause the tree to be inconsistent. Clear the child pointer in __of_attach_node() to be absolutely sure that the structure remains in a consistent layout. Signed-off-by: Grant Likely [off-list ref]diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index c875787fa394..b96d83100987 100644 --- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c@@ -98,6 +98,7 @@ int of_property_notify(int action, struct device_node *np, void __of_attach_node(struct device_node *np) { + np->child = NULL; np->sibling = np->parent->child; np->allnext = np->parent->allnext; np->parent->allnext = np;Not sure what the reasoning was here since this prevents attaching a node that is a sub tree. Either need to revert that, rewrite dlpar_attach_node to iterate over the sub-tree, or probably best rewrite dlpar_configure_connector to use a of_changeset instead.
Good find. I'd guess that adding a subtree used to sort of work, except that children of np were not added to the allnext list, which would effectively hide them from some lookups. Regardless, yes, the pseries code needs to change to attach and detach nodes individually. I don't think the OF core supports more complex changes.