RE: [PATCH 3/3] powerpc: bpf: implement in-register swap for 64-bit endian operations
From: David Laight <hidden>
Date: 2017-01-24 16:14:01
Also in:
netdev
From: 'Naveen N. Rao'
Sent: 23 January 2017 19:22
> On 2017/01/15 09:00AM, Benjamin Herrenschmidt wrote:
> > On Fri, 2017-01-13 at 23:22 +0530, 'Naveen N. Rao' wrote:
> > > > That rather depends on whether the processor has a store to load forwarder
> > > > that will satisfy the read from the store buffer.
> > > > I don't know about ppc, but at least some x86 will do that.
> > >
> > > Interesting - good to know that.
> > >
> > > However, I don't think powerpc does that and in-register swap is likely
> > > faster regardless. Note also that gcc prefers this form at higher
> > > optimization levels.
> >
> > Of course powerpc has a load-store forwarder these days, however, I
> > wouldn't be surprised if the in-register form was still faster on some
> > implementations, but this needs to be tested.
>
> Thanks for clarifying! To test this, I wrote a simple (perhaps naive)
> test that just issues a whole lot of endian swaps and in _that_ test, it
> does look like the load-store forwarder is doing pretty well.
...
> This is all in a POWER8 vm. On POWER7, the in-register variant is around
> 4 times faster than the ldbrx variant.
...

I wonder which is faster on the little 1GHz embedded ppc we use here.

	David
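
For readers following the thread, the two approaches being compared can be sketched roughly as below. This is only an illustration in portable C, not the JIT code from the patch and not the test program mentioned above (which is elided in the quote). The function names swap64_in_register and swap64_via_memory are made up for this sketch, and the memory variant merely stands in for a store followed by a byte-reversed load such as ldbrx.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 64-bit byte swap using only shifts and masks,
 * so the value never has to leave a register. */
static uint64_t swap64_in_register(uint64_t x)
{
    /* Swap the two 32-bit halves. */
    x = ((x & 0x00000000ffffffffULL) << 32) | (x >> 32);
    /* Swap the 16-bit halves inside each 32-bit word. */
    x = ((x & 0x0000ffff0000ffffULL) << 16) |
        ((x >> 16) & 0x0000ffff0000ffffULL);
    /* Swap the bytes inside each 16-bit halfword. */
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) |
        ((x >> 8) & 0x00ff00ff00ff00ffULL);
    return x;
}

/* Hypothetical helper: 64-bit byte swap through memory. The value is
 * written out and read back in reverse byte order. This is a portable
 * stand-in for "store, then byte-reversed load", not real ldbrx. */
static uint64_t swap64_via_memory(uint64_t x)
{
    unsigned char in[8], out[8];
    uint64_t r;
    int i;

    memcpy(in, &x, sizeof(in));
    for (i = 0; i < 8; i++)
        out[i] = in[7 - i];
    memcpy(&r, out, sizeof(r));
    return r;
}

/* Sanity check: both variants must agree, and swapping twice must
 * return the original value. */
static void swap64_selfcheck(uint64_t x)
{
    assert(swap64_in_register(x) == swap64_via_memory(x));
    assert(swap64_in_register(swap64_in_register(x)) == x);
}
```

Timing these two in a loop would not settle the question in the thread by itself: an optimizing compiler may well turn both into the same instruction sequence, so a real comparison would need the actual instruction sequences (for example via inline assembly) on the target CPU.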