Thread (126 messages) 126 messages, 14 authors, 2018-04-02

Re: RFC on writel and writel_relaxed

From: Jason Gunthorpe <jgg@ziepe.ca>
Date: 2018-03-29 14:45:25
Also in: linux-rdma

On Thu, Mar 29, 2018 at 10:19:41AM +0100, Will Deacon wrote:
On Wed, Mar 28, 2018 at 10:57:32AM -0600, Jason Gunthorpe wrote:
quoted
On Wed, Mar 28, 2018 at 11:13:45AM +0100, Will Deacon wrote:
quoted
On Wed, Mar 28, 2018 at 09:01:27PM +1100, Benjamin Herrenschmidt wrote:
quoted
On Wed, 2018-03-28 at 11:55 +0200, Arnd Bergmann wrote:
quoted
quoted
powerpc and ARM can't quite make them synchronous I think, but at least
they should have the same semantics as writel.
One thing that ARM does IIRC is that it only guarantees to order writel() within
one device, and the memory mapped PCI I/O space window almost certainly
counts as a separate device to the CPU.
That sounds bogus.
To elaborate, if you do the following on arm:

	writel(DEVICE_FOO);
	writel(DEVICE_BAR);

we generally cannot guarantee in which order those accesses will hit the
devices even if we add every barrier under the sun. You'd need something
in between, specific to DEVICE_FOO (probably a read-back) to really push
the first write out. This doesn't sound like it would be that uncommon to
me.
The PCI posted write does not require the above to execute 'in order'
only that any bus segment shared by the two devices have the writes
issued in CPU order. ie at a shared PCI root port for instance.

If I recall this is very similar to the ordering that ARM's on-chip
AXI interconnect is supposed to provide.. So I'd be very surprised if
a modern ARM64 has an meaningful difference from x86 here.
From the architectural perspective, writes to different "peripherals" are
not ordered with respect to each other. The first writel will complete once
it gets its write acknowledgement, but this may not necessarily come from
the endpoint -- it could come from an intermediate buffer past the point of
serialisation (i.e. the write will then be ordered with respect to other
accesses to that same endpoint). The PCI root port would look like one
peripheral here.
That is basically the same as PCI - PCI has no write ACK, so all
writes are buffered by the PCI interconnect and complete in some
undefined temporal order when multiple end points are involved.

This does not seem very different from what happens in x86..
quoted
When talking about ordering between the devices, the relevant question
is what happens if the writel(DEVICE_BAR) triggers DEVICE_BAR to DMA
from the DEVICE_FOO. 'ordered' means that in this case
writel(DEVICE_FOO) must be presented to FOO before anything generated
by BAR.
Yes, and that isn't the case for arm because the writes can still be
buffered.
The statement is not about buffering, or temporal completion order, or
the order of acks returning to the CPU. It is about pure transaction
ordering inside the interconnect.

Can write BAR -> FOO pass write CPU -> FOO?

Jason
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help