Re: [dpdk-dev] How to disable SVE auto vectorization while using GCC
From: Honnappa Nagarahalli <hidden>
Date: 2021-05-08 19:18:10
<snip>
On Fri, Apr 30, 2021 at 5:27 PM fengchengwen [off-list ref] wrote:

Hi, ALL

We have a question for your help:

1. We have two platforms, both of which are ARM64, one of which supports both NEON and SVE, the other only support NEON.
2. We want to run on both platforms with a single binary file, and use the highest vector capability of the corresponding platform whenever possible.

I see VPP has a similar feature. IMO, it is not present in DPDK. Basically, in order to do this:
- Compile slow-path code (90% of DPDK) with minimal CPU instruction set support
- Have fastpath functions compiled with different CPU instruction set levels
- In slowpath, attach the fastpath function pointer based on CPU instruction-level support.

3. So we build the DPDK program with -march=armv8-a+sve+crc (GCC 10.2). However, it is found that invalid instructions occur when the program runs on a machine that does not support SVE (pls see below).
4. The problem is caused by the introduction of SVE in GCC automatic vector optimization. So is there a way to disable GCC automatic vector optimization or use only NEON to perform automatic vector optimization?

BTW: we already test -fno-tree-vectorize (as link below) but found no effect.
https://stackoverflow.com/questions/7778174/how-can-i-disable-vectorization-while-using-gcc

The GDB output:

EAL: Detected 128 lcore(s)
EAL: Detected 4 NUMA nodes
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead

Program received signal SIGILL, Illegal instruction.
0x0000000000671b88 in eal_adjust_config ()
(gdb)
(gdb) where
#0  0x0000000000671b88 in eal_adjust_config ()
#1  0x0000000000682840 in rte_eal_init ()
#2  0x000000000051c870 in main ()
(gdb)

The disassembly output of eal_adjust_config:

  671b7c: f8237a81  str     x1, [x20, x3, lsl #3]
  671b80: f110001f  cmp     x0, #0x400
  671b84: 54ffff21  b.ne    671b68 <eal_adjust_config+0x1f4>  // b.any
  671b88: 043357f5  addvl   x21, x19, #-1
  671b8c: 043457e1  addvl   x1, x20, #-1
  671b90: 910562b5  add     x21, x21, #0x158
  671b94: 04e0e3e0  cntd    x0
  671b98: 914012b5  add     x21, x21, #0x4, lsl #12
  671b9c: 52800218  mov     w24, #0x10  // #16
  671ba0: 25d8e3e1  ptrue   p1.d
  671ba4: 25f80fe0  whilelo p0.d, wzr, w24
  671ba8: a5e04020  ld1d    {z0.d}, p0/z, [x1, x0, lsl #3]

Best regards.

Is there a way to use GCC function multiversioning for this?
https://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning.html

Not sure if this is available on all compiler versions that DPDK claims to support. It looks like it made it into GCC 6 and LLVM 7.

It looks like it is not fully supported for Arm. For example, 'target_clones' is not supported, and the automatic dispatcher does not seem to be supported either, so we need to write our own dispatcher. The following code works and should be sufficient for DPDK. There is no need to pass the SVE flag to the compiler on the command line. I do not have a machine with SVE, so the SVE part is not tested.

#include <stdio.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP, HWCAP_SVE */

/* Baseline version: compiled for Armv8-A without SVE. */
__attribute__((target ("arch=armv8-a+crc")))
int foo_neon (void)
{
	printf ("Neon\n");
	return 1;
}

/* SVE version: only this function is compiled with SVE enabled. */
__attribute__((target ("arch=armv8-a+sve")))
int foo_sve (void)
{
	printf ("SVE\n");
	return 2;
}

/*
 * The following code can go into IO function selection in DPDK during
 * initialization.
 */
void
foo_selector (void)
{
	static int (*foo)(void);

	if (!foo)
		/* The following code can use DPDK wrappers */
		foo = (getauxval(AT_HWCAP) & HWCAP_SVE) ? foo_sve : foo_neon;

	foo ();
}

int main (void)
{
	foo_selector();
	return 0;
}
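
For discussion, here is a variant of the same idea with the selection separated from the call, so the choice is made once at initialization and the fast path only goes through a function pointer. This is only a sketch under assumptions: io_neon, io_sve, io_burst, io_select, io_call and read_hwcap are hypothetical names, not DPDK APIs. The two implementations are plain functions here so the snippet builds on any host; in real code they would carry the target attributes shown above. The fallback HWCAP_SVE value assumes the arm64 Linux bit position (bit 22) and is only used where the system headers do not define it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP */
#endif

/* Assumption: arm64 Linux reports SVE in bit 22 of AT_HWCAP. */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1UL << 22)
#endif

typedef int (*io_fn_t)(void);

/* Hypothetical fast-path implementations (placeholders). */
int io_neon(void) { puts("Neon"); return 1; }
int io_sve(void)  { puts("SVE");  return 2; }

/* Fast-path entry point, resolved once by io_select(). */
io_fn_t io_burst;

/* Read the hardware capability bits; returns 0 on non-arm64 hosts. */
unsigned long read_hwcap(void)
{
#if defined(__linux__) && defined(__aarch64__)
	return getauxval(AT_HWCAP);
#else
	return 0;
#endif
}

/* Slow path: pick the implementation from the capability bits. */
void io_select(unsigned long hwcap)
{
	io_burst = (hwcap & HWCAP_SVE) ? io_sve : io_neon;
}

/* Fast path: a single indirect call, no capability check. */
int io_call(void)
{
	assert(io_burst != NULL);
	return io_burst();
}
```

At startup this would be called as io_select(read_hwcap()); passing the capability bits as a parameter also lets the selection logic be exercised on a machine without SVE.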