Thread (1 message) 1 message, 1 author, 2014-06-25

Re: OF_DYNAMIC node lifecycle

From: Grant Likely <hidden>
Date: 2014-06-25 20:24:46
Also in: linuxppc-dev

Possibly related (same subject, not in this thread)

On Tue, 24 Jun 2014 15:10:55 -0500, Nathan Fontenot [off-list ref] wrote:
On 06/23/2014 09:48 AM, Grant Likely wrote:
quoted
On Thu, 19 Jun 2014 10:26:15 -0500, Nathan Fontenot [off-list ref] wrote:
quoted
On 06/18/2014 03:07 PM, Grant Likely wrote:
quoted
Hi Nathan and Tyrel,

I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and
I'm hoping you can help me. Right now, pseries seems to be the only
user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on
the entire kernel because it requires all DT code to manage reference
counting with iterating over nodes. Most users simply get it wrong.
Pantelis did some investigation and found that the reference counts on
a running kernel are all over the place. I have my doubts that any
code really gets it right.

The problem is that users need to know when it is appropriate to call
of_node_get()/of_node_put(). All list traversals that exit early need
an extra call to of_node_put(), and code that is searching for a node
in the tree and holding a reference to it needs to call of_node_get().

I've got a few pseries questions:
- What are the changes being requested by pseries firmware? Is it only
CPUs and memory nodes, or does it manipulate things all over the tree?
The short answer, everything.
:-)
quoted
For pseries the two big actions that can change the device tree are
adding/removing resources and partition migration.

The most frequent updates to the device tree happen during resource
(cpu, memory, and pci/phb) add and remove. During this process we add
and remove the node and its properties from the device tree.
- For memory on newer systems this just involves updating the
  ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory property. Older
  firmware levels add and remove the memroy@XXX nodes and their properties.
- For cpus the cpus/PowerPC,POWERXXXX nodes and its properties are added
  or removed
- For pci/phb the pci@XXXXX nodes and properties are added/removed.

The less frequent operation of live partition migration (and suspend/resume)
can update just about anything in the device tree. When this occurs and the
systems starts after being migrated (or waking up after a suspend) we make
a call to firmware to get updates to the device tree for the new hardware
we are running on.
 
quoted
- How frequent are the changes? How many changes would be likely over
the runtime of the system?
This can happen frequently.
Thanks, that is exactly the information that I want. I'm not so much
concerned with the addition or removal of nodes/properties, which is
actually pretty easy to handle. It is the lifecycle of allocations on
dynamic nodes that causes heartburn.
quoted
quoted
- Are you able to verify that removed nodes are actually able to be
freed correctly? Do you have any testcases for node removal?
I have always tested this by doing resource add/remove, usually cpu and memory
since it is the easiest.
Is that just testing the functionality, or do you have tests that check
if the memory gets freed?
In general it's just functionality testing.
quoted
quoted
quoted
I'm thinking very seriously about changing the locking semantics of DT
code entirely so that most users never have to worry about
of_node_get/put at all. If the DT code is switched to use rcu
primitives for tree iteration (which also means making DT code use
list_head, something I'm already investigating), then instead of
trying to figure out of_node_get/put rules, callers could use
rcu_read_lock()/rcu_read_unlock() to protect the region that is
searching over nodes, and only call of_node_get() if the node pointer
is needed outside the rcu read-side lock.
This sounds good. I like just taking the rcu lock around accessing the DT.
Do we have many places where DT node pointers are held that require
keeping the of_node_get/put calls? If this did exist perhaps we could
update those places to look up the DT node every time instead of
holding on to the pointer. We could just get rid of the reference counting
altogether then.
There are a few, but I would be happy to restrict reference counting to
only those locations. Most places will decode the DT data, and then
throw away the reference. We /might/ even be able to do rcu_lock/unlock
around the entire probe path which would make it transparent to all
device drivers.
quoted
quoted
I'd really like to be rid of the node reference counting entirely, but
I can't figure out a way of doing that safely, so I'd settle for
making it a lot easier to get correct.
heh! I have often thought about adding reference counting to device tree
properties.
You horrible, horrible man.
Yes. I are evil :)

After looking again the work needed to add reference counts to properties
would be huge. The few properties I am concerned with are specific to powerpc
so perhaps just adding an arch specific lock around updating those
properties would work.
Which code/properties? I'd like to have a look myself.

g.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help