RE: [PATCH v2 0/3] lib/string: optimized mem* functions

From: David Laight <hidden>
Date: 2021-07-12 09:04:23
Also in: linux-riscv, lkml

From: Matteo Croce

Sent: 11 July 2021 00:08

On Sat, Jul 10, 2021 at 11:31 PM Andrew Morton
[off-list ref] wrote:

On Fri,  2 Jul 2021 14:31:50 +0200 Matteo Croce [off-list ref] wrote:

From: Matteo Croce <redacted>

Rewrite the generic mem{cpy,move,set} so that memory is accessed with
the widest size possible, but without doing unaligned accesses.
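As an illustration of the idea only (this is not the code from the patch), a word-at-a-time copy that never issues an unaligned access might look roughly like the sketch below. The names sketch_memcpy and word_t are hypothetical, the may_alias attribute assumes GCC or Clang, and the sketch simply falls back to byte copies when source and destination are not co-aligned, which a real implementation would likely handle more cleverly.

```c
#include <stddef.h>
#include <stdint.h>

/* Word type that may alias other types (GCC/Clang extension, assumed here). */
typedef unsigned long __attribute__((__may_alias__)) word_t;

#define WORD_MASK (sizeof(word_t) - 1)

/* Hypothetical name; a simplified sketch, not the function from the patch. */
void *sketch_memcpy(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	/* Word copies are only used when both pointers share the same misalignment. */
	if ((((uintptr_t)d ^ (uintptr_t)s) & WORD_MASK) == 0) {
		/* Copy bytes until dest (and therefore src) is word aligned. */
		while (count && ((uintptr_t)d & WORD_MASK)) {
			*d++ = *s++;
			count--;
		}
		/* Bulk of the copy: one aligned word per iteration. */
		while (count >= sizeof(word_t)) {
			*(word_t *)d = *(const word_t *)s;
			d += sizeof(word_t);
			s += sizeof(word_t);
			count -= sizeof(word_t);
		}
	}

	/* Tail bytes, or the whole buffer when the pointers are not co-aligned. */
	while (count--)
		*d++ = *s++;

	return dest;
}
```

The byte-wise fallback keeps the sketch short; it gives up the wide accesses whenever the two pointers differ in alignment.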

This was originally posted as C string functions for RISC-V[1], but as
there was no specific RISC-V code, it was proposed for the generic
lib/string.c implementation.

Tested on RISC-V and on x86_64 by undefining __HAVE_ARCH_MEM{CPY,SET,MOVE}
and HAVE_EFFICIENT_UNALIGNED_ACCESS.

These are the performances of memcpy() and memset() of a RISC-V machine
on a 32 mbyte buffer:

memcpy:
original aligned:      75 Mb/s
original unaligned:    75 Mb/s
new aligned:          114 Mb/s
new unaligned:        107 Mb/s

memset:
original aligned:     140 Mb/s
original unaligned:   140 Mb/s
new aligned:          241 Mb/s
new unaligned:        241 Mb/s

Did you record the x86_64 performance?


Which other architectures are affected by this change?

x86_64 won't use these functions because it defines __HAVE_ARCH_MEMCPY
and has optimized implementations in arch/x86/lib.
Anyway, I was curious and I tested them on x86_64 too, there was zero
gain over the generic ones.

x86 performance (and attainable performance) does depend on the cpu
micro-architecture.

Any recent 'desktop' intel cpu will almost certainly manage to
re-order the execution of almost any copy loop and attain 1 write per clock.
(Even the trivial 'while (count--) *dest++ = *src++;' loop.)

The same isn't true of the Atom based cpu that may be on small servers.
These are no slouches (eg 4 cores at 2.4GHz) but only have limited
out-of-order execution and so are much more sensitive to instruction
ordering.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
