Re: CONFIG_ARCH_SUPPORTS_INT128: Why not mips, s390, powerpc, and alpha?
From: Segher Boessenkool <hidden>
Date: 2019-03-30 06:06:24
Also in:
linux-alpha, linux-mips
Hi!

On Fri, Mar 29, 2019 at 01:07:07PM +0000, George Spelvin wrote:
I was working on some scaling code that can benefit from 64x64->128-bit multiplies. GCC supports an __int128 type on processors with hardware support (including z/Arch and MIPS64), but the support was broken on early compilers, so it's gated behind CONFIG_ARCH_SUPPORTS_INT128. Currently, of the ten 64-bit architectures Linux supports, that's only enabled on x86, ARM, and RISC-V. SPARC and HP-PA don't have support. But that leaves Alpha, Mips, PowerPC, and S/390x. Current mips64, powerpc64, and s390x gcc seems to generate sensible code for mul_u64_u64_shr() in <linux/math64.h> if I cross-compile them.
Yup.
I don't have easy access to an Alpha cross-compiler to test, but as it has UMULH, I suspect it would work, too.
https://mirrors.edge.kernel.org/pub/tools/crosstool/
u64 get_random_u64(void);
u64 get_random_max64(u64 range, u64 lim)
{
	unsigned __int128 prod;
	do {
		prod = (unsigned __int128)get_random_u64() * range;
	} while (unlikely((u64)prod < lim));
	return prod >> 64;
}

Which turns into these inner loops:

MIPS:
.L7:
	jal	get_random_u64
	nop
	dmultu	$2,$17
	mflo	$3
	sltu	$4,$3,$16
	bne	$4,$0,.L7
	mfhi	$2

PowerPC:
.L9:
	bl	get_random_u64
	nop
	mulld	9,3,31
	mulhdu	3,3,31
	cmpld	7,30,9
	bgt	7,.L9

s/390:
.L13:
	brasl	%r14,get_random_u64@PLT
	lgr	%r5,%r2
	mlgr	%r4,%r10
	lgr	%r2,%r4
	clgr	%r11,%r5
	jh	.L13

I like that the MIPS code leaves the high half of the product in the hi register until it tests the low half; I wish PowerPC would similarly move the mulhdu *after* the loop,
The MIPS code has the multiplication inside the loop as well, and even the mfhi I think: MIPS has delay slots.

GCC treats the int128 as one register until it has expanded to RTL, and it does not do such loop optimisations after that, apparently.

File a PR please? https://gcc.gnu.org/bugzilla/

Segher
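
For anyone who wants to try this outside the kernel, here is a rough user-space sketch of the loop discussed above, plus a hand-split variant that computes only the low half of the product inside the loop and the high half once afterwards (what the quoted mail wishes the compiler did with mulhdu). Nothing here is kernel code: the xorshift generator standing in for get_random_u64(), the helper names mul_hi_u64(), rejection_limit() and get_random_max64_split(), and the choice lim = 2^64 mod range are all assumptions made for illustration. It needs a 64-bit GCC or Clang target that provides unsigned __int128.

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as the kernel's unlikely(); __builtin_expect is a GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the kernel's get_random_u64():
 * a plain xorshift64 generator, NOT suitable for real randomness. */
static uint64_t rng_state = 0x9e3779b97f4a7c15ULL;

static uint64_t get_random_u64(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 7;
	rng_state ^= rng_state << 17;
	return rng_state;
}

/* High 64 bits of the 64x64 -> 128-bit product (illustrative helper). */
static uint64_t mul_hi_u64(uint64_t a, uint64_t b)
{
	return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* Assumed meaning of "lim": 2^64 mod range, computed without 128-bit math. */
static uint64_t rejection_limit(uint64_t range)
{
	assert(range != 0);
	return (0 - range) % range;
}

/* The loop from the mail, with user-space types. */
static uint64_t get_random_max64(uint64_t range, uint64_t lim)
{
	unsigned __int128 prod;

	do {
		prod = (unsigned __int128)get_random_u64() * range;
	} while (unlikely((uint64_t)prod < lim));
	return (uint64_t)(prod >> 64);
}

/* Hand-split variant: only the low half is computed per iteration,
 * the high half is computed once after the loop exits. */
static uint64_t get_random_max64_split(uint64_t range, uint64_t lim)
{
	uint64_t x;

	do {
		x = get_random_u64();
	} while (unlikely(x * range < lim));	/* low 64 bits of the product */
	return mul_hi_u64(x, range);
}
```

Calling get_random_max64(range, rejection_limit(range)) should return a value in [0, range). Whether the split form actually produces better code than what the compiler emits for the __int128 version depends on the target and GCC version, so it is worth comparing the generated assembly before drawing conclusions.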