Re: [PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision
From: Simon Guo <hidden>
Date: 2017-09-28 05:26:16
On Wed, Sep 27, 2017 at 09:43:44AM +0000, David Laight wrote:
> From: Segher Boessenkool
> Sent: 27 September 2017 10:28
> ...
> > You also need nasty code to deal with the start and end of strings,
> > with conditional branches and whatnot, which quickly overwhelms the
> > benefit of using vector registers at all. This tradeoff also changes
> > with newer ISA versions.
>
> The goal posts keep moving.
> For instance with modern intel x86 cpus 'rep movsb' is by far the
> fastest way to copy data (from cached memory).
>
> > Things have to become *really* cheap before it will be good to often
> > use vector registers in the kernel though.
>
> I've had thoughts about this in the past.
> If the vector registers belong to the current process then you might
> get away with just saving the ones you want to use.
> If they belong to a different process then you also need to tell the
> FPU save code where you've saved the registers.
> Then the IPI code can recover all the correct values.
>
> On X86 all the AVX registers are caller saved, the system call entry
> could issue the instruction that invalidates them all.
> Kernel code running in the context of a user process could then use
> the registers without saving them.
> It would only need to set a mark to ensure they are invalidated again
> on return to user (might be cheap enough to do anyway).
> Dunno about PPC though.
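
To make the "mark" idea quoted above concrete, here is a toy user-space model in C. It is only a sketch of the scheme as described, not kernel code: struct toy_cpu, the toy_* functions and NR_VREGS are all hypothetical names, and zeroing an array stands in for whatever instruction would invalidate the real registers.

```c
#include <stdbool.h>
#include <string.h>

#define NR_VREGS 4                  /* toy register file, not a real ISA size */

struct toy_cpu {
    unsigned long vreg[NR_VREGS];   /* stand-in for the vector registers */
    bool kernel_clobbered;          /* the "mark" from the quoted text */
};

/* Syscall entry: the registers are caller-saved, so the user values
 * may simply be discarded (modelled here by zeroing). */
static void toy_syscall_entry(struct toy_cpu *cpu)
{
    memset(cpu->vreg, 0, sizeof(cpu->vreg));
    cpu->kernel_clobbered = false;
}

/* Kernel code uses a register without any save/restore and only
 * sets the mark. */
static void toy_kernel_use_vreg(struct toy_cpu *cpu, int idx, unsigned long val)
{
    cpu->vreg[idx] = val;
    cpu->kernel_clobbered = true;
}

/* Return to user: invalidate again, but only if the mark is set, so
 * that no kernel data is left in the registers. */
static void toy_return_to_user(struct toy_cpu *cpu)
{
    if (cpu->kernel_clobbered) {
        memset(cpu->vreg, 0, sizeof(cpu->vreg));
        cpu->kernel_clobbered = false;
    }
}
```

The model assumes a single invalidate-everything operation plus one software flag; it says nothing about whether the hardware can track validity per register.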
I am not aware of any ppc instruction which can set a "mark" or provide
a fine-grained validity flag for a single vec reg or a subgroup of vec
regs. But ppc experts may want to correct me.

Thanks,
- Simon
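
For reference, the start/end handling mentioned in the quoted text can be sketched in portable C. This is not the patch's VMX code: 8-byte loads stand in for vector loads, and sketch_memcmp, cmp_bytes and WIDE_THRESHOLD are hypothetical names, with the threshold value picked arbitrarily rather than measured.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WIDE_THRESHOLD 64   /* hypothetical cutoff; a real one needs measurement */

/* Byte-at-a-time compare, used for short inputs, the head and the tail. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;
    int r;

    /* Short inputs never reach the wide path. */
    if (n < WIDE_THRESHOLD)
        return cmp_bytes(a, b, n);

    /* Head: advance byte by byte until 'a' is 8-byte aligned. */
    size_t head = (size_t)(-(uintptr_t)a & 7);
    r = cmp_bytes(a, b, head);
    if (r)
        return r;
    a += head; b += head; n -= head;

    /* Body: 8 bytes per step; memcpy keeps the loads valid ISO C. */
    while (n >= 8) {
        uint64_t x, y;
        memcpy(&x, a, 8);
        memcpy(&y, b, 8);
        if (x != y)
            return cmp_bytes(a, b, 8);  /* locate the differing byte */
        a += 8; b += 8; n -= 8;
    }

    /* Tail: whatever is left over, again byte by byte. */
    return cmp_bytes(a, b, n);
}
```

Even in this simplified form the wide loop is only a few lines, while the threshold check, head and tail add several extra branches. A real vector version would also have to pay for saving and restoring the vector registers before the loop runs.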