Thread: 10 messages, 4 authors, 2021-09-19

Re: [PATCH] riscv: use the generic string routines

From: Palmer Dabbelt <palmer@dabbelt.com>
Date: 2021-08-04 20:40:22
Also in: linux-riscv, lkml

On Tue, 03 Aug 2021 09:54:34 PDT (-0700), mcroce@linux.microsoft.com wrote:
On Mon, Jul 19, 2021 at 1:44 PM Matteo Croce [off-list ref] wrote:
From: Matteo Croce <redacted>

Use the generic routines which handle alignment properly.

These are the performances measured on a BeagleV machine for a
32 mbyte buffer:

memcpy:
original aligned:        75 Mb/s
original unaligned:      75 Mb/s
new aligned:            114 Mb/s
new unaligned:          107 Mb/s

memset:
original aligned:       140 Mb/s
original unaligned:     140 Mb/s
new aligned:            241 Mb/s
new unaligned:          241 Mb/s

TCP throughput with iperf3 gives a similar improvement as well.

This is the binary size increase according to bloat-o-meter:

add/remove: 0/0 grow/shrink: 4/2 up/down: 432/-36 (396)
Function                                     old     new   delta
memcpy                                        36     324    +288
memset                                        32     148    +116
strlcpy                                      116     132     +16
strscpy_pad                                   84      96     +12
strlcat                                      176     164     -12
memmove                                       76      52     -24
Total: Before=1225371, After=1225767, chg +0.03%

Signed-off-by: Matteo Croce <redacted>
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
---
Hi,

can someone have a look at this change and share opinions?

This LGTM.  How are the generic string routines landing?  I'm happy to
take this into my for-next, but IIUC we need the optimized generic 
versions first so we don't have a performance regression falling back to 
the trivial ones for a bit.  Is there a shared tag I can pull in?