Thread: 26 messages, 8 authors, 2010-07-30

Re: [PATCH V4] powerpc/prom: Export device tree physical address via proc

From: Grant Likely <hidden>
Date: 2010-07-15 18:37:55

On Thu, Jul 15, 2010 at 12:03 PM, Matthew McClintock [off-list ref] wrote:
On Thu, 2010-07-15 at 10:57 -0600, Grant Likely wrote:
On Thu, Jul 15, 2010 at 10:39 AM, Matthew McClintock [off-list ref] wrote:
On Thu, 2010-07-15 at 10:22 -0600, Grant Likely wrote:
Thanks for taking a look. My first thought was to just blow away all
the memreserve regions and start over. But, there are reserve regions
for other things that I might not want to blow away. For example, on
mpc85xx SMP systems we have an additional reserve region for our boot
page.
What is your starting point?  Where does the device tree (and
memreserve list) come from that you're passing to kexec?  My first
impression is that if you have to scrub the memreserve list, then the
source being used to obtain the memreserves is either faulty or
unsuitable to the task.
I'm pulling the device tree passed in via u-boot and passing it to
kexec.
How?  (what mechanism?)  I hope you're not using the debugfs
flat-device-tree file.
That is one way to get a good working copy. What is wrong with this
mechanism?
It's unstable.  It is in the debugfs, so there are no guarantees that
the ABI will remain the same.  Plus it doesn't reflect any changes
that the kernel may make to the device tree.  That interface is *debug
only*.  Do not use it.
Should we duplicate everything u-boot does in kexec to build up a flat
device tree? Or is there another way to get a good tree?
That is one option.  U-Boot really shouldn't be modifying the tree
very much anyway (I know on some platforms U-Boot is almost creating a
tree from scratch, but that is insane and an entirely different
discussion).  /proc/device-tree always gives the kernel's current view
of the tree.  You can use dtc to extract it and write it into a dtb.
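
A minimal sketch of that extraction step, assuming dtc is installed and
that its filesystem input format (-I fs) is used on /proc/device-tree.
The helper names and the output file name are made up for illustration;
this is not what kexec-tools actually does.

```python
import shutil
import subprocess


def dtc_command(tree_dir="/proc/device-tree", out_path="current.dtb"):
    """Build the dtc command line that packs a tree directory into a dtb.

    -I fs  : read the tree from a directory hierarchy
    -O dtb : write a flattened device tree blob
    -o     : output file (name is an arbitrary choice here)
    """
    return ["dtc", "-I", "fs", "-O", "dtb", "-o", out_path, tree_dir]


def extract_live_tree(tree_dir="/proc/device-tree", out_path="current.dtb"):
    """Run dtc on the kernel's current view of the tree."""
    if shutil.which("dtc") is None:
        raise RuntimeError("dtc not found in PATH")
    subprocess.run(dtc_command(tree_dir, out_path), check=True)
    return out_path
```

Calling extract_live_tree() on the running system should leave a dtb
that reflects the tree as the kernel currently sees it, rather than the
blob the bootloader originally handed over.
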
Ideally, we
don't make the end user manually edit a device tree.
Of course not, any device tree manipulation is the job of the kexec
tools.  None of this should be manual.  However, the data source is a
significant and important question.
It is the most complete device tree and requires the least amount
of fixup.

I have to scrub two items, the ramdisk/initrd and the device tree
because upon kexec'ing the kernel we have the ability to pass in new
ramdisk/initrd and device tree. They can also live at different physical
addresses for the second reboot.
This sounds like the model is backwards.  Rather than scrubbing items,
the memreserve list should be built up from a known good source.
You can build one up yourself and it will still work out fine. Or you
can pull one from debugfs to get yourself started. Or you can pull it
every time.
What do you mean by "pull it every time"?

Out of curiosity, what is responsible for building up the memreserve
list?  The userspace portion, or the kernel portion of kexec?  Or is
it done by a totally separate program?
The initrd addresses are already exposed, so we can update/remove/reuse
that entry, we just need a way for kexec to determine the current device
tree address so it can replace the correct memreserve region for the
kexec'ing kernel's device tree.

The whole problem comes from repeatedly kexec'ing, we need to make sure
we don't keep losing blobs of memory to reserve regions (so we can't
just blindly add). We also need to make sure we don't lose other
memreserve regions that might be important for other things (so we can't
just blow them all away).
Right, so you need to have a known-good list of reserve sections.
Trying to go the other way sounds very fragile.
Yes. Where would we get a list of memreserve sections?
I would say the list of reserves that are not under the control of
Linux should be explicitly described in the device tree proper.  For
instance, if you have a region that firmware depends on, then have a
node for describing the firmware and a property stating the memory
regions that it depends on.  The memreserve regions can be generated
from that.
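
To make the idea concrete, here is a rough sketch that derives the
memreserve entries from the tree itself. It models the tree as nested
dicts rather than parsing a real dtb, and the property name
"example,reserved-regions" and the addresses are purely hypothetical;
no such binding is being proposed here.

```python
RESERVE_PROP = "example,reserved-regions"  # hypothetical property name


def collect_reserves(node, path="/"):
    """Walk a nested-dict model of the tree.

    Returns a list of (node_path, address, size) for every node that
    declares regions it depends on.
    """
    found = []
    for addr, size in node.get("props", {}).get(RESERVE_PROP, []):
        found.append((path, addr, size))
    for name, child in sorted(node.get("children", {}).items()):
        found.extend(collect_reserves(child, path.rstrip("/") + "/" + name))
    return found


# Made-up tree: one firmware node that owns a 4 KiB boot page.
tree = {
    "props": {},
    "children": {
        "firmware": {
            "props": {RESERVE_PROP: [(0x7FFFF000, 0x1000)]},
            "children": {},
        },
        "memory": {"props": {}, "children": {}},
    },
}

# The memreserve list is generated from the tree, not scrubbed from an old one.
memreserve = [(addr, size) for _, addr, size in collect_reserves(tree)]
```

Because each entry is tied to a node path, the extracted tree says both
what the region is and why it is reserved.
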
Should we export
the reserve sections instead of the device tree location?
It shouldn't really be something that the kernel is explicitly
exporting because it is a characteristic of the board design.  It is
something that belongs in the tree-proper.  ie. when you extract the
tree you have data telling what the region is, and why it is reserved.
We just need a
way to preserve what was there at boot to pass to the new kernel.
Yet there is no differentiation between the board-dictated memory
reserves and the things that U-Boot/Linux made an arbitrary decision
on.  The solution should focus not on "can I throw this one away?" but
rather "Is this one I should keep?"  :-)  A subtle difference, I know,
but it changes the way you approach the solution.
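
As an illustration of the "which ones do I keep?" approach, the sketch
below rebuilds the list from scratch on every kexec: the known-good
board entries plus whatever new blobs are being loaded. The function
name, the tuple layout and all addresses are assumptions for the
example only.

```python
def next_memreserve(board_reserves, new_dtb, new_initrd=None):
    """Build the memreserve list for the next kernel.

    Entries are (address, size) tuples. Nothing is carried over from
    the previous kernel's list except the known-good board entries.
    """
    entries = list(board_reserves)
    entries.append(new_dtb)
    if new_initrd is not None:
        entries.append(new_initrd)
    return sorted(set(entries))


board = [(0x7FFFF000, 0x1000)]  # made-up boot page region

# First kexec: new dtb and initrd at some addresses.
first = next_memreserve(board, new_dtb=(0x00F00000, 0x4000),
                        new_initrd=(0x01000000, 0x200000))

# Second kexec: dtb lives elsewhere, no initrd this time.
second = next_memreserve(board, new_dtb=(0x00E00000, 0x4000))
```

Since the old dtb and initrd entries are never inherited, repeated
kexec'ing cannot leak memory into stale reserve regions, and the board
entries are never lost.
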

Cheers,
g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.