Re: [PATCH] cache: Workaround HiSilicon Taishan DC CVAU
From: chenweilong <hidden>
Date: 2021-12-29 03:11:54
Also in:
lkml
On 2021/12/14 2:56, Will Deacon wrote:
> On Fri, Nov 26, 2021 at 05:11:39PM +0800, Weilong Chen wrote:
> > Taishan's L1/L2 cache is inclusive, and the data is consistent.
> > Any change of L1 does not require DC operation to brush CL in L1 to L2.
> > It's safe that don't clean data cache by address to point of unification.
> > Without IDC feature, kernel needs to flush icache as well as dcache,
> > causes performance degradation.
> > The flaw refers to V110/V200 variant 1.
> >
> > Signed-off-by: Weilong Chen <redacted>
> > ---
> >  Documentation/arm64/silicon-errata.rst |  2 ++
> >  arch/arm64/Kconfig                     | 11 +++++++++
> >  arch/arm64/include/asm/cputype.h       |  2 ++
> >  arch/arm64/kernel/cpu_errata.c         | 32 ++++++++++++++++++++++++++
> >  arch/arm64/tools/cpucaps               |  1 +
> >  5 files changed, 48 insertions(+)
>
> Hmm. We don't usually apply optimisations for specific CPUs on arm64,
> simply because the diversity of CPUs out there means it quickly becomes
> a fragmented mess.
>
> Is this patch purely a performance improvement? If so, please can you
> provide some numbers in an attempt to justify it?
Yes, it's a performance improvement. I have a test program like this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/time.h>

int main(void)
{
	void *tmp;
	size_t len = 200 * 1024 * 1024;	/* 200 MiB */
	struct timeval start, end;
	long interval;			/* elapsed time in microseconds */

	tmp = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (tmp == MAP_FAILED) {
		perror("mmap failed");
		exit(EXIT_FAILURE);
	}
	/* Write to every page so the whole mapping is populated. */
	memset(tmp, 0, len);

	/* Time only the switch to an executable mapping. */
	gettimeofday(&start, NULL);
	if (mprotect(tmp, len, PROT_READ | PROT_EXEC)) {
		perror("Couldn't mprotect");
		exit(EXIT_FAILURE);
	}
	gettimeofday(&end, NULL);

	interval = 1000000 * (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec);
	printf("interval = %fms\n", interval / 1000.0);
	return 0;
}
Without this fix, the mprotect takes:
interval = 25.608000ms
And with this fix:
interval = 0.689000ms
So for this 200 MiB mapping the mprotect call is roughly 37 times faster with the fix.
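If more stable numbers are needed, the measurement could be repeated with a monotonic clock and the best run reported. The following is only a minimal sketch under that assumption; the helper names (elapsed_us, time_mprotect_exec, best_of) are illustrative and are not part of the patch or of the test program above.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <time.h>
#include <sys/mman.h>

/* Elapsed time from a to b, in whole microseconds. */
long long elapsed_us(struct timespec a, struct timespec b)
{
	long long ns = (long long)(b.tv_sec - a.tv_sec) * 1000000000LL
		     + (b.tv_nsec - a.tv_nsec);
	return ns / 1000;
}

/*
 * Time one mprotect(PROT_READ | PROT_EXEC) over a freshly written
 * anonymous mapping of len bytes.
 * Returns microseconds, or -1 if mmap or mprotect fails.
 */
long long time_mprotect_exec(size_t len)
{
	struct timespec t0, t1;
	int rc;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	memset(p, 0, len);	/* populate every page before timing */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	rc = mprotect(p, len, PROT_READ | PROT_EXEC);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap(p, len);
	return rc ? -1 : elapsed_us(t0, t1);
}

/* Minimum time over several runs, to reduce scheduling noise. */
long long best_of(int runs, size_t len)
{
	long long best = -1;

	for (int i = 0; i < runs; i++) {
		long long us = time_mprotect_exec(len);

		if (us < 0)
			return -1;
		if (best < 0 || us < best)
			best = us;
	}
	return best;
}
```

Calling best_of(5, 200 * 1024 * 1024) from main() would mirror the 200 MiB case above. Taking the minimum rather than the mean is a design choice to filter out runs disturbed by other activity on the machine.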
If you think this is acceptable, I will send a v2 patch, as the original patch broke the CPU hotplug checks.
Thanks, Will.