Thread: 101 messages, 19 authors, 2008-06-12

Re: MMIO and gcc re-ordering issue

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2008-06-03 22:27:57
Also in: linux-arch, lkml

On Tue, 2008-06-03 at 12:43 -0700, Trent Piepho wrote:
> Byte-swapping vs not byte-swapping is not usually what the programmer wants.
> Usually your device's registers are defined as being big-endian or
> little-endian and you want whatever is needed to give you that.

Yes, which is why I (and some other archs) have writel_be/readl_be.

The standard writel/readl being LE.

However, the "raw" variants are defined to be native endian, which is of
some use to -some- archs, apparently, where they have SoC devices whose
endianness follows the core.
> I believe that on some archs that can be either byte order, some built-in
> devices will change their registers to match, and so you want "native endian"
> or no swapping for these.  Though that's definitely in the minority.
>
> An accessor that always byte-swaps regardless of the endianness of the host
> is never something I've seen a driver want.
>
> IOW, there are four ways one can define endianness/swapping:
> 1) Little-endian
> 2) Big-endian
> 3) Native-endian aka non-byte-swapping
> 4) Foreign-endian aka byte-swapping
>
> 1 and 2 are by far the most used.  Some code wants 3.  No one wants 4.  Yet
> our API is providing 3 & 4, the two which are the least useful.

No, we don't provide 4; that was something unclear with Nick.

We provide 1 (writel/readl and their __ variants), some archs provide 2
(writel_be/readl_be, though I don't have __ variants; I suppose I could),
and everybody provides 3, though in some cases (like us) only in the
form of __ variants (i.e., non-ordered, like __raw_readl/__raw_writel).

Nick's proposal is to plug those gaps, though it's, I believe, missing
the _be variants.
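
For readers following along, the four options in the list above can be sketched in plain user-space C. This is only an illustration under stated assumptions: every sketch_* name is hypothetical and not part of any kernel API, the "register" is an ordinary byte buffer rather than real MMIO, and ordering/barriers are ignored entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-ins for the accessor families discussed
 * above. They read an ordinary byte buffer, not real MMIO, and they
 * ignore ordering and barriers entirely. */

/* Unconditional 32-bit byte swap. */
static uint32_t sketch_swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* 3) Native-endian: plain load in host byte order, no swapping
 *    (the role the thread assigns to __raw_readl). */
static uint32_t sketch_readl_native(const void *addr)
{
    uint32_t v;
    memcpy(&v, addr, sizeof(v));
    return v;
}

/* 1) Little-endian: byte 0 is least significant on any host
 *    (the role the thread assigns to readl). */
static uint32_t sketch_readl_le(const void *addr)
{
    const uint8_t *p = addr;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* 2) Big-endian: byte 0 is most significant on any host
 *    (the role the thread assigns to readl_be). */
static uint32_t sketch_readl_be(const void *addr)
{
    const uint8_t *p = addr;
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* 4) Foreign-endian: always swap relative to the host. Shown only for
 *    completeness; per the thread, nobody wants this one. */
static uint32_t sketch_readl_swapped(const void *addr)
{
    return sketch_swab32(sketch_readl_native(addr));
}

/* Small self-check on a fake register holding bytes 78 56 34 12. */
static void sketch_selfcheck(void)
{
    const uint8_t reg[4] = { 0x78, 0x56, 0x34, 0x12 };
    uint32_t native = sketch_readl_native(reg);

    /* The fixed-endian reads give the same value on every host. */
    assert(sketch_readl_le(reg) == 0x12345678u);
    assert(sketch_readl_be(reg) == 0x78563412u);
    /* The native read matches whichever one agrees with the host. */
    assert(native == sketch_readl_le(reg) || native == sketch_readl_be(reg));
    /* The foreign read is always the byte-swapped native value. */
    assert(sketch_readl_swapped(reg) == sketch_swab32(native));
}
```

The point of the sketch is that 1 and 2 return a host-independent value, which is what a driver for a device with a fixed register layout needs, while 3 and 4 depend on which host the code runs on.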
> Is it enough to provide only "all or none" for ordering strictness?  For
> instance on powerpc, one can get a speedup by dropping strict ordering for IO
> vs cacheable memory, but still keeping ordering for IO vs IO and IO vs locks.
> This is much easier to program for than no ordering at all.  In fact, if one
> doesn't use coherent DMA, it's basically the same as fully strict ordering.

Ben.