Re: [PATCH 10/10] powerpc: remove address space overrides using set_fs()
From: Christophe Leroy <hidden>
Date: 2020-09-03 07:21:02
Also in: linux-arch, linux-fsdevel, lkml
On 02/09/2020 at 20:02, Linus Torvalds wrote:
> On Wed, Sep 2, 2020 at 8:17 AM Christophe Leroy [off-list ref] wrote:
>> With this fix, I get
>>
>> root@vgoippro:~# time dd if=/dev/zero of=/dev/null count=1M
>> 536870912 bytes (512.0MB) copied, 6.776327 seconds, 75.6MB/s
>>
>> That's still far from the 91.7MB/s I get with 5.9-rc2, but better than
>> the 65.8MB/s I got yesterday with your series. Still some way to go though.
>
> I don't see why this change would make any difference.
Neither do I.

Looks like nowadays, CONFIG_STACKPROTECTOR has become a default.
I rebuilt the kernel without it, and I now get a throughput of 99.8MB/s
both without and with this series.

Looking at the generated code (GCC 10.1), a small change in a function
seems to make large changes in the generated code when
CONFIG_STACKPROTECTOR is set.

In addition to that, trivial functions which don't use the stack at all
get a stack frame anyway when CONFIG_STACKPROTECTOR is set, although
that's only -fstack-protector-strong. And there is no canary check.

Without CONFIG_STACKPROTECTOR:

c01572a0 <no_llseek>:
c01572a0:	38 60 ff ff	li      r3,-1
c01572a4:	38 80 ff e3	li      r4,-29
c01572a8:	4e 80 00 20	blr

With CONFIG_STACKPROTECTOR (regardless of CONFIG_STACKPROTECTOR_STRONG or
not):

c0164e08 <no_llseek>:
c0164e08:	94 21 ff f0	stwu    r1,-16(r1)
c0164e0c:	38 60 ff ff	li      r3,-1
c0164e10:	38 80 ff e3	li      r4,-29
c0164e14:	38 21 00 10	addi    r1,r1,16
c0164e18:	4e 80 00 20	blr

Wondering why CONFIG_STACKPROTECTOR has become the default. It seems to
imply a 10% performance loss even in the best case (91.7MB/s versus
99.8MB/s).

Note that without CONFIG_STACKPROTECTOR_STRONG, I'm at 99.3MB/s, so it's
really the _STRONG alternative that hurts.

Christophe
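For anyone wanting to look at the same effect outside the kernel tree, here is a minimal userspace sketch of a function shaped like no_llseek(). It assumes the function does nothing but return -ESPIPE (errno 29, which matches the li r4,-29 in the listings above) as a 64-bit value. The names no_llseek_sketch and loff_sketch_t are hypothetical stand-ins, and the real function's arguments are dropped; this is not the kernel source.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's 64-bit loff_t. */
typedef long long loff_sketch_t;

/*
 * Leaf function with no locals, no arrays and no calls: nothing here
 * needs stack storage. noinline keeps it as a separate symbol so it
 * shows up on its own in a disassembly.
 */
__attribute__((noinline)) loff_sketch_t no_llseek_sketch(void)
{
	return -ESPIPE;
}
```

Compiling this once with "gcc -O2 -c" and once with -fstack-protector-strong added, then comparing the "objdump -d" output, is one way to check whether a given compiler emits the extra stwu/addi pair for such a function. The kernel build passes additional options, so the result may not match what is seen in-tree.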