Thread (50 messages) 50 messages, 4 authors, 2014-03-21

Re: [PATCH 9/9] powerpc/pm: support deep sleep feature on T1040

From: Scott Wood <hidden>
Date: 2014-03-18 23:19:21
Also in: lkml

On Sun, 2014-03-16 at 12:58 +0800, Kevin Hao wrote:
On Fri, Mar 14, 2014 at 05:26:27PM -0500, Scott Wood wrote:
quoted
On Thu, 2014-03-13 at 15:46 +0800, Kevin Hao wrote:
quoted
On Wed, Mar 12, 2014 at 12:43:05PM -0500, Scott Wood wrote:
quoted
quoted
Shouldn't we use "readback, sync" here? The following is quoted form t4240RM:
  To guarantee that the results of any sequence of writes to configuration
  registers are in effect, the final configuration register write should be
  immediately followed by a read of the same register, and that should be
  followed by a SYNC instruction. Then accesses can safely be made to memory
  regions affected by the configuration register write.
I agree that the sync before the readback is probably not necessary,
since transactions to the same address should already be ordered.

A sync after the readback helps if you're trying to order the readback
with subsequent memory accesses, though in that case wouldn't a sync
alone (no readback) be adequate?
No, we don't just want to order the subsequent memory access here.
The 'write, readback, sync' is the required sequence if we want to make
sure that the writing to CCSR register does really take effect.
quoted
 Though maybe not always -- see the
comment near the end of fsl_elbc_write_buf() in
drivers/mtd/nand_fsl_elbc.c.  I guess the readback does more than just
make sure the device has seen the write, ensuring that the device has
finished the transaction to the point of acting on another one.
Agree.
quoted
The data dependency plus isync sequence, which is done by the normal I/O
accessors used from C code, orders the readback versus all future
instructions (not just I/O).  The delay loop is not I/O.
According to the PowerISA, the sequence 'load, date dependency, isync' only
order the load accesses. 
The point is to order the delay loop after the load, not to order
storage versus storage.
I think the point is to make sure that the writing of the CCSR_DDR_SDRAM_CFG_2
does really take effect before we begin to delay loop.
Yes.
 The sequence "write, readback, sync" will guarantee this according to the manual. 
If you're referring to the section you quoted above, the manual does not
say that.  It only talks about when accesses "to memory regions affected
by the configuration register write" can be safely made.
If we just want to
order the delay loop after the load, the following sequence should be enough:
	store to CCSR_DDR_SDRAM_CFG_2
	load from CCSR_DDR_SDRAM_CFG_2
	isync or sync
	delay loop

Why do we need the 'date dependency' here? According to the e6500 manual, the
instructions can execute out of order, but complete in order. So I am really
wondering why we need to order the load and the following delay loop if there
is no intention to order the storage access? 
The data (not date) dependency means that the twi will not execute until
after the load instruction returns data (thus, after the device has
responded to the load).  The twi, being a flow control instruction
rather than a storage instruction, should be fully ordered by isync.
From the isync description in the ISA: "Except as described in the
preceding sentence, the isync instruction may complete before storage
accesses associated with instructions preceding the
isync instruction have been performed."

I don't know if that really applies to loads (as opposed to stores), and
it probably doesn't apply to guarded loads, but I feel safer leaving the
twi in.

As for sync instead of isync, I see nothing to indicate that it would be
adequate (though it may be that the only reason there needed to be a
delay loop in the first place was the lack of a readback/sync before
doing other I/O, in which case this is moot).
quoted
This is a sequence we're already using on all of our I/O loads
(excluding accesses like in this patch that don't use the standard
accessors).  I'm confident that it works even if it's not
architecturally guaranteed.
This sequence is used to order the load and the followed storage access.
It's also used to order loads versus other things, such as enabling
MSR[EE].
And this is guaranteed by the architecture. But I don't think it is suitable
for a case like this. The following is quoted from PowerISA:
  Because stores cannot be performed “out-of-order”
  (see Book III), if a Store instruction depends on the
  value returned by a preceding Load instruction
  (because the value returned by the Load is used to
  compute either the effective address specified by the
  Store or the value to be stored), the corresponding stor-
  age accesses are performed in program order. The
  same applies if whether the Store instruction is exe-
  cuted depends on a conditional Branch instruction that
  in turn depends on the value returned by a preceding
  Load instruction.
  
  Because an isync instruction prevents the execution of
  instructions following the isync until instructions pre-
  ceding the isync have completed, if an isync follows a
  conditional Branch instruction that depends on the
  value returned by a preceding Load instruction, the
  load on which the Branch depends is performed before
  any loads caused by instructions following the isync.

I think the above description would guarantee that the load will be performed
before any storage access (both load and store) following the isync in the
following scenario:
	lwz	r4, 0(r3)
	twi	0, r4, 0
	isync
I think it's a poorly worded section, but yes, it guarantees both loads
and stores -- unnecessarily doing so in separate places with different
wording.  But by the definition of isync I don't see why it does not
apply to *any* instruction after the isync, not just loads and stores.
quoted
 I'm not sure that there exists a clear
architectural way of synchronizing non-storage instructions relative to
storage instructions.
Isn't what the execution synchronization instructions such as sync, isync, mtmsr
do?
No.  Execution synchronizing just says that the previous instructions
"have completed to a point at which they have reported all the
exceptions they will cause" (I'm assuming this doesn't include machine
check error reports), and that the previous instructions won't be
affected by context changes after the execution synchronizing
instruction, not that the previous instructions are fully completed.
 
quoted
Given that isync is documented as preventing any execution of
instructions after the isync until all previous instructions complete,
it doesn't seem to make sense for the architecture to explicitly talk
about loads (as opposed to any other instruction) following a load,
dependent conditional branch, isync sequence.
Sorry, I didn't get what you mean.
I mean that I do not understand why it says, "the load on which the
Branch depends is performed before any loads caused by instructions
following the isync" rather than "the Load on which the Branch depends
is performed before any instructions following the isync".
quoted
quoted
So if we want to order all the storage access as well
as execution synchronization, we should choose sync here.
Do we need execution synchronization or context synchronization?
There is no context-altering instruction here, so I think an execution
synchronizing instruction should be enough here.
Is the ISA ever explicit about what constitutes "context"?  In any case,
just because we don't need that aspect of context synchronization
doesn't mean execution synchronization is enough.  The behavior we want
is described in the isync instruction specifically, not in "context
synchronization" or "execution synchronization".
 
quoted
The t4240 RM section that talks about a readback and a sync is in the
context of subsequent memory operations ("Then accesses can safely be
made to memory regions affected..."), not arbitrary instructions.
I assume that this sequence also guarantee that the writing does take effect.
You may assume that, but the manual doesn't say that.  Sync could
(AFAIK) be implemented by emitting a barrier on the bus, or delaying
future storage accesses, rather than waiting for everything to have
finished before proceeding.
quoted
In any case, this is not performance critical and thus it's better to
oversynchronize than undersynchronize.
On the contrary I think that sync is oversynchronize instead of
undersynchronize. It not only provide the execution synchronizing but also
order all the storage accesses. That is why I prefer the sync. :-)
sync is not a superset of isync.  isync is not a superset of sync.  If
you want to oversynchronize to be safe, do both (though even that isn't
equivalent to a barrier that orders everything, which is partly why a
readback is needed).

-Scott
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help