Thread: 36 messages, 8 authors, 2010-11-29

Re: Pegasos OHCI bug (was Re: PROBLEM: memory corrupting bug,

From: Segher Boessenkool <hidden>
Date: 2010-10-28 19:50:29
Also in: lkml

So is it wrong to leave the host controller enabled when the OS is booted?
Yes.  Or, rather, there should be some way for the client to turn off
all dma and interrupt activity; if the client closes the ihandles in
"/chosen", and perhaps calls "quiesce", that should be enough.


If not, then the error must be in the communication of which memory
addresses are in use by OF. I've got a node /memory@0 whose "available"
property looks like this:
 00000000 00400000
 00584000 0007c000
 0092a1d8 00004e28
 00a2f000 005d1000
 01800000 0e3fd000
 0fbffab4 0000054c
From that list, it looks to me like OF is telling the kernel that it
should not attempt to use any address above 0xfbffab4+0x54c == 0xfc00000.
The client is allowed to "take over" all memory, if it doesn't call OF
after doing so.  This won't work if some device scribbles on it, as
you have seen.
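
The arithmetic on the "available" property above can be checked with a
short sketch. It assumes the property is a flat list of (base, size)
pairs with one cell each, which is what the listing suggests; the
function names are made up for illustration.

```python
# Cells copied from the "available" property quoted above.
# Assumption: one address cell and one size cell per entry.
AVAILABLE_CELLS = [
    0x00000000, 0x00400000,
    0x00584000, 0x0007c000,
    0x0092a1d8, 0x00004e28,
    0x00a2f000, 0x005d1000,
    0x01800000, 0x0e3fd000,
    0x0fbffab4, 0x0000054c,
]


def available_ranges(cells):
    """Group a flat cell list into (base, end) ranges; end is exclusive."""
    if len(cells) % 2:
        raise ValueError("expected an even number of cells")
    return [(base, base + size) for base, size in zip(cells[0::2], cells[1::2])]


def top_of_available(cells):
    """Return the first address above every range listed as available."""
    return max(end for _, end in available_ranges(cells))


if __name__ == "__main__":
    for base, end in available_ranges(AVAILABLE_CELLS):
        print(f"{base:#010x} .. {end:#010x}")
    print(f"top of available memory: {top_of_available(AVAILABLE_CELLS):#x}")
```

The last entry ends at 0xfc00000, matching the sum given above. The
holes between entries are the ranges OF is keeping for itself.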
Later, when the kernel decides it's done using OF, what's supposed to
happen? It closes stdin, but that doesn't help here since the offending
device is a bus node, not an input node. It looks to me like the kernel
makes the assumption that all devices other than stdin and stdout will
have been deactivated already when the kernel starts, and that this
assumption has been violated. Who is wrong, from the perspective of the
OF standard, the assumer or the violator?
The violator.
Lovely, incorrect data (it should start with 82002810, i.e.,
not relocatable -- it is already an assigned address!).
Now you see how I have trouble relating the docs to the reality...
Yeah :-(
This means: 32-bit MMIO address space for bus 0 dev 5 fn 0,
first BAR; assigned to address 80000000; size is 1000.
But "address 80000000" is a physical address (I think), so do I need to
do a map-in on it before using it?
Yes.
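
For reference, the first cell discussed above (82002810) can be taken
apart as follows. This is a sketch assuming the phys.hi layout from the
PCI bus binding to Open Firmware (npt000ss bbbbbbbb dddddfff rrrrrrrr);
the function name and dictionary keys are illustrative only.

```python
# Space codes from the "ss" field of phys.hi (PCI bus binding layout).
SPACE_CODES = {0: "config", 1: "I/O", 2: "32-bit memory", 3: "64-bit memory"}


def decode_phys_hi(cell):
    """Split a 32-bit phys.hi cell into its fields."""
    return {
        "non_relocatable": bool((cell >> 31) & 0x1),  # the "n" bit
        "prefetchable": bool((cell >> 30) & 0x1),     # the "p" bit
        "space": SPACE_CODES[(cell >> 24) & 0x3],     # the "ss" field
        "bus": (cell >> 16) & 0xFF,
        "device": (cell >> 11) & 0x1F,
        "function": (cell >> 8) & 0x7,
        "register": cell & 0xFF,                      # config-space offset
    }


if __name__ == "__main__":
    for key, value in decode_phys_hi(0x82002810).items():
        print(f"{key}: {value}")
```

With 82002810 this gives a non-relocatable 32-bit memory address on
bus 0, device 5, function 0, register 0x10 (the first BAR).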
You could try a boot script like this:


dev /pci
0 ffff04 DO 0 i config-w! -100 +LOOP
device-end


which should disable all PCI devices on all busses, on that
Almost all of my devices are under that PCI node. What will I prove by
disabling them?
You should put it after "load", and before "go".

It should give you a working system; it's a sledgehammer workaround.
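
To make clear what the loop in that script touches, here is a sketch
that only enumerates the addresses it walks and writes nothing. It
assumes the config address passed to config-w! packs bus, device and
function the same way phys.hi does, and that offset 04 is the standard
PCI command register; the helper names are hypothetical.

```python
COMMAND_REG = 0x04  # standard PCI command register offset


def config_addresses():
    """Addresses visited by `0 ffff04 DO ... -100 +LOOP` (hex), in order."""
    # The Forth loop starts at ffff04 and steps down by 100 (hex)
    # until the index would drop below the limit 0.
    return range(0xFFFF04, -1, -0x100)


def decode(addr):
    """Return (bus, device, function, register) for a config address."""
    return (addr >> 16) & 0xFF, (addr >> 11) & 0x1F, (addr >> 8) & 0x07, addr & 0xFF


if __name__ == "__main__":
    addrs = config_addresses()
    print(f"{len(addrs)} command registers would be written with 0")
    print("first:", decode(addrs[0]))
    print("last: ", decode(addrs[-1]))
```

Writing 0 to the command register clears the I/O, memory and bus master
enable bits, so every function on every bus stops decoding accesses and
stops doing DMA, which is why this works as a blunt fix.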


Segher