Re: [PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision
From: Michael Ellerman <mpe@ellerman.id.au>
Date: 2017-09-27 03:38:10
Segher Boessenkool [off-list ref] writes:
> On Tue, Sep 26, 2017 at 03:34:36PM +1000, Michael Ellerman wrote:
> > Cyril Bur [off-list ref] writes:
> > > This was written for userspace which doesn't have to explicitly enable VMX in order to use it - we need to be smarter in the kernel.
> >
> > Well the kernel has to do it for them after a trap, which is actually even more expensive, so arguably the glibc code should be smarter too and the threshold before using VMX should probably be higher than in the kernel (to cover the cost of the trap).
>
> A lot of userspace code uses V*X, more and more with newer CPUs and newer compiler versions. If you already paid the price for using vector registers you do not need to again :-)

True, but you don't know if you've paid the price already.

You also pay the price on every context switch (more state to switch), so it's not free even once enabled. Which is why the kernel will eventually turn it off if it's unused again.

But now that I've actually looked at the glibc version, it does do some checks for minimum length before doing any vector instructions, so that's probably all we want. The exact trade off between checking some bytes without vector vs turning on vector depends on your input data, so it's tricky to tune in general.

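To make the minimum-length idea concrete, here is a rough sketch in portable C. It is not the glibc or kernel code: the names (memcmp_sketch, cmp_bytes, cmp_wide) and both constants (WIDE_THRESHOLD, PRECHECK_BYTES) are made up for illustration, and the "wide" loop just compares 8 bytes at a time as a stand-in for a real VMX loop, which in the kernel would also need vector state enabled first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tuning values, not taken from glibc or the kernel. */
#define WIDE_THRESHOLD 4096  /* below this length, never use the wide path */
#define PRECHECK_BYTES 32    /* prefix checked byte-wise before going wide */

/* Plain byte-by-byte compare; returns <0, 0 or >0 like memcmp(). */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/*
 * Stand-in for the vector loop: compares 8 bytes per step and falls back
 * to cmp_bytes() to locate the differing byte. A real VMX version would
 * pay a one-off cost to enable vector state before entering this loop.
 */
static int cmp_wide(const unsigned char *a, const unsigned char *b, size_t n)
{
    while (n >= 8) {
        uint64_t x, y;

        memcpy(&x, a, 8);
        memcpy(&y, b, 8);
        if (x != y)
            return cmp_bytes(a, b, 8);
        a += 8;
        b += 8;
        n -= 8;
    }
    return cmp_bytes(a, b, n);
}

static int memcmp_sketch(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;
    int r;

    /* Short compares: the enable cost would outweigh any gain. */
    if (n < WIDE_THRESHOLD)
        return cmp_bytes(a, b, n);

    /* Long compares: an early mismatch still avoids the wide path. */
    r = cmp_bytes(a, b, PRECHECK_BYTES);
    if (r != 0)
        return r;

    return cmp_wide(a + PRECHECK_BYTES, b + PRECHECK_BYTES,
                    n - PRECHECK_BYTES);
}
```

Where the two constants should sit is exactly the input-dependent tuning question: buffers that differ early favour a longer scalar prefix, buffers that are mostly equal favour going wide sooner.
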
> > But I digress :)
>
> Yeah sorry :-)

:) cheers