Thread (18 messages), 5 authors, 2017-09-28

Re: [PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

From: Michael Ellerman <mpe@ellerman.id.au>
Date: 2017-09-26 05:34:36

Cyril Bur [off-list ref] writes:
> On Sun, 2017-09-24 at 05:18 +0800, Simon Guo wrote:
>> Hi Cyril,
>> On Sat, Sep 23, 2017 at 12:06:48AM +1000, Cyril Bur wrote:
>>> On Thu, 2017-09-21 at 07:34 +0800, wei.guo.simon@gmail.com wrote:
>>>> From: Simon Guo <redacted>
>>>>
>>>> This patch add VMX primitives to do memcmp() in case the compare size
>>>> exceeds 4K bytes.
>>> Sorry I didn't see this sooner, I've actually been working on a kernel
>>> version of glibc commit dec4a7105e (powerpc: Improve memcmp performance
>>> for POWER8) unfortunately I've been distracted and it still isn't done.
>> Thanks for sync with me. Let's consolidate our effort together :)
>>
>> I have a quick check on glibc commit dec4a7105e.
>> Looks the aligned case comparison with VSX is launched without rN size
>> limitation, which means it will have a VSX reg load penalty even when the
>> length is 9 bytes.
> This was written for userspace which doesn't have to explicitly enable
> VMX in order to use it - we need to be smarter in the kernel.

Well the kernel has to do it for them after a trap, which is actually
even more expensive, so arguably the glibc code should be smarter too
and the threshold before using VMX should probably be higher than in the
kernel (to cover the cost of the trap).
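
To make the threshold idea concrete, here is a rough userspace sketch of a
size-gated dispatch. It is only an illustration, not code from the patch or
from glibc: memcmp_dispatch(), vector_memcmp_stub(), scalar_memcmp(),
last_path and VMX_MEMCMP_THRESHOLD are all hypothetical names, the vector
path is a plain C stand-in that uses no VMX instructions, and the 4096 value
is taken from the "exceeds 4K bytes" wording in the patch description. A
userspace threshold would presumably be tuned separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed cut-off, borrowed from the "exceeds 4K bytes" patch description. */
#define VMX_MEMCMP_THRESHOLD 4096

enum cmp_path { PATH_SCALAR, PATH_VECTOR };

/* Hypothetical bookkeeping: records which path the last call took. */
static enum cmp_path last_path;

/* Plain byte-by-byte compare, used for short lengths and for tails. */
static int scalar_memcmp(const unsigned char *a, const unsigned char *b,
                         size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/*
 * Stand-in for the vector path. Real code would pay a fixed cost here
 * (enabling VMX in the kernel, or taking a trap from userspace) before
 * comparing 16 bytes at a time; this stub only mimics the chunking.
 */
static int vector_memcmp_stub(const unsigned char *a, const unsigned char *b,
                              size_t n)
{
    size_t i = 0;

    /* Skip over 16-byte chunks that are identical. */
    for (; i + 16 <= n; i += 16)
        if (memcmp(a + i, b + i, 16) != 0)
            break;

    /* Resolve the first differing chunk, or the tail, byte by byte. */
    return scalar_memcmp(a + i, b + i, n - i);
}

/* Take the vector path only when n is large enough to amortise its cost. */
int memcmp_dispatch(const void *s1, const void *s2, size_t n)
{
    assert(s1 != NULL && s2 != NULL);

    if (n > VMX_MEMCMP_THRESHOLD) {
        last_path = PATH_VECTOR;
        return vector_memcmp_stub(s1, s2, n);
    }
    last_path = PATH_SCALAR;
    return scalar_memcmp(s1, s2, n);
}
```

With this shape, the 9 byte case Simon mentions stays on the scalar path and
never pays the vector setup cost.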

But I digress :)

cheers