Re: [PATCH 3/4] RISC-V: Fix L1_CACHE_BYTES for RV32
From: Geert Uytterhoeven <geert@linux-m68k.org>
Date: 2021-01-17 18:56:25
Hi Atish,

On Sat, Jan 16, 2021 at 2:39 AM Atish Patra [off-list ref] wrote:
> On Thu, Jan 14, 2021 at 11:59 PM Geert Uytterhoeven [off-list ref] wrote:
> > On Thu, Jan 14, 2021 at 10:11 PM Atish Patra [off-list ref] wrote:
> > > On Thu, Jan 14, 2021 at 11:46 AM Palmer Dabbelt [off-list ref] wrote:
> > > > On Thu, 14 Jan 2021 10:33:01 PST (-0800), atishp@atishpatra.org wrote:
> > > > > On Wed, Jan 13, 2021 at 9:10 PM Palmer Dabbelt [off-list ref] wrote:
> > > > > > On Thu, 07 Jan 2021 01:26:51 PST (-0800), Atish Patra wrote:
> > > > > > > SMP_CACHE_BYTES/L1_CACHE_BYTES should be defined as 32 instead of
> > > > > > > 64 for RV32. Otherwise, there will be hole of 32 bytes with each
> > > > > > > memblock allocation if it is requested to be aligned with
> > > > > > > SMP_CACHE_BYTES.
> > > > > > >
> > > > > > > Signed-off-by: Atish Patra <redacted>
> > > > > > > ---
> > > > > > >  arch/riscv/include/asm/cache.h | 4 ++++
> > > > > > >  1 file changed, 4 insertions(+)
> > > > > > >
> > > > > > > diff --git a/arch/riscv/include/asm/cache.h b/arch/riscv/include/asm/cache.h
> > > > > > > index 9b58b104559e..c9c669ea2fe6 100644
> > > > > > > --- a/arch/riscv/include/asm/cache.h
> > > > > > > +++ b/arch/riscv/include/asm/cache.h
> > > > > > > @@ -7,7 +7,11 @@
> > > > > > >  #ifndef _ASM_RISCV_CACHE_H
> > > > > > >  #define _ASM_RISCV_CACHE_H
> > > > > > > +#ifdef CONFIG_64BIT
> > > > > > >  #define L1_CACHE_SHIFT 6
> > > > > > > +#else
> > > > > > > +#define L1_CACHE_SHIFT 5
> > > > > > > +#endif
> > > > > > >  #define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
> > > > > >
> > > > > > Should we not instead just
> > > > > >
> > > > > > #define SMP_CACHE_BYTES L1_CACHE_BYTES
> > > > > >
> > > > > > like a handful of architectures do?
> > > > >
> > > > > The generic code already defines it that way in include/linux/cache.h
> > > > >
> > > > > > The cache size is sort of fake here, as we don't have any
> > > > > > non-coherent mechanisms, but IIRC we wrote somewhere that it's
> > > > > > recommended to have 64-byte cache lines in RISC-V implementations
> > > > > > as software may assume that for performance reasons. Not really a
> > > > > > strong reason, but I'd prefer to just make these match.
> > > > >
> > > > > If it is documented somewhere in the kernel, we should update that.
> > > > > I think SMP_CACHE_BYTES being 64 actually degrades the performance
> > > > > as there will be a fragmented memory blocks with 32 bit bytes gap
> > > > > wherever SMP_CACHE_BYTES is used as an alignment requirement.
> > > >
> > > > I don't buy that: if you're trying to align to the cache size then
> > > > the gaps are the whole point. IIUC the 64-byte cache lines come from
> > > > DDR, not XLEN, so there's really no reason for these to be different
> > > > between the base ISAs.
> > >
> > > Got your point. I noticed this when fixing the resource tree issue
> > > where the SMP_CACHE_BYTES alignment was not intentional but causing
> > > the issue. The real issue was solved via another patch in this series
> > > though.
> > >
> > > Just to clarify, if the allocation function intends to allocate
> > > consecutive memory, it should use 32 instead of SMP_CACHE_BYTES. This
> > > will lead to a #ifdef macro in the code.
> > >
> > > > > In addition to that, Geert Uytterhoeven mentioned some panic on
> > > > > vex32 without this patch. I didn't see anything in Qemu though.
> > > >
> > > > Something like that is probably only going to show up on real
> > > > hardware, QEMU doesn't really do anything with the cache line size.
> > > > That said, as there's nothing in our kernel now related to
> > > > non-coherent memory there really should only be performance issue
> > > > (at least until we have non-coherent systems).
> > > >
> > > > I'd bet that the change is just masking some other bug, either in
> > > > the software or the hardware. I'd prefer to root cause this rather
> > > > than just working around it, as it'll probably come back later and
> > > > in a more difficult way to find.
> > >
> > > Agreed. @Geert Uytterhoeven Can you do a further analysis of the panic
> > > you were saying ? We may need to change an alignment requirement to 32
> > > for RV32 manually at some place in code.
> >
> > My findings were in
> > https://lore.kernel.org/linux-riscv/CAMuHMdWf6K-5y02+WJ6Khu1cD6P0n5x1wYQikrECkuNtAA1pgg@mail.gmail.com/
> >
> > Note that when the memblock.reserved list kept increasing, it kept on
> > adding the same entry to the list. But that was fixed by
> > "[PATCH 1/4] RISC-V: Do not allocate memblock while iterating reserved
> > memblocks".
> >
> > After that, only the (reproducible) "Unable to handle kernel paging
> > request at virtual address 61636473" was left, always at the same
> > place. No idea where the actual corruption happened.
>
> Yes. I was asking about this panic. I don't have the litex fpga to
> reproduce this as well. Can you take a look at the epc & ra to figure
> out where exactly is the fault ? That will help to understand the real
> cause for this panic.
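As an aside on the alignment argument quoted above, here is a minimal sketch of the "hole of 32 bytes" from the patch description. It is plain userspace C rather than the memblock code, and align_up() and padding_after() are hypothetical helper names, not kernel functions. It only assumes that each allocation is rounded up to a power-of-two alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t x, size_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: bytes left unused after an allocation of `size`
 * bytes when the next allocation has to start on an `align` boundary.
 */
static size_t padding_after(size_t size, size_t align)
{
	return align_up(size, align) - size;
}
```

For a 32-byte allocation, padding_after(32, 64) is 32 while padding_after(32, 32) is 0, which is the gap the patch description refers to.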
[...]
Freeing initrd memory: 8192K
workingset: timestamp_bits=30 max_order=15 bucket_order=0
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler mq-deadline registered
io scheduler kyber registered
Unable to handle kernel paging request at virtual address 61636473
Oops [#1]
CPU: 0 PID: 1 Comm: swapper/0 Not tainted
5.11.0-rc3-orangecrab-00068-g267ecb2e2e9d-dirty #37
epc: c000f8e4 ra : c00110e8 sp : c082bc70
gp : c0665948 tp : c0830000 t0 : c08ba500
t1 : 00000002 t2 : 00000000 s0 : c082bc80
s1 : 00000000 a0 : c05e2dec a1 : c08ba4e0
a2 : c0665d38 a3 : 61636473 a4 : f0004003
a5 : f0004000 a6 : c7efffb8 a7 : c08ba4e0
s2 : 01001f00 s3 : c0666000 s4 : c05e2e0c
s5 : c0666000 s6 : 80000000 s7 : 00000006
s8 : c05a4000 s9 : c08ba4e0 s10: c05e2dec
s11: 00000000 t3 : c08ba500 t4 : 00000001
t5 : 00076400 t6 : c08bbb5e
status: 00000120 badaddr: 61636473 cause: 0000000d
---[ end trace 50524759df172195 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
---[ end Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000b ]---
Looking up epc and ra in System.map, closest symbols are:
c000f8b0 t __request_resource
c0010ff4 T __request_region
The above is with a kernel built from my own config, but using
litex_vexriscv_defconfig with https://github.com/geertu/linux branch
litex-v5.11 and commit 718aaf7d1c351035 ("RISC-V: Fix L1_CACHE_BYTES for
RV32") reverted gives the exact same results.
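For anyone repeating the lookup, a rough sketch of the "closest symbol" search in plain C follows. The table holds only the two System.map entries quoted above (a real System.map has many more, so the result for other addresses would differ), and struct sym and closest_symbol() are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sym {
	uint32_t addr;
	const char *name;
};

/* Only the two entries quoted in this mail, sorted by ascending address. */
static const struct sym symtab[] = {
	{ 0xc000f8b0, "__request_resource" },
	{ 0xc0010ff4, "__request_region" },
};

/* Return the symbol with the highest address <= addr, or NULL if none. */
static const struct sym *closest_symbol(uint32_t addr)
{
	const struct sym *best = NULL;
	size_t i;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		if (symtab[i].addr > addr)
			break;		/* table is sorted, nothing later can match */
		best = &symtab[i];
	}
	return best;
}
```

With this table, closest_symbol(0xc000f8e4) gives __request_resource (epc is 0x34 bytes past its start) and closest_symbol(0xc00110e8) gives __request_region (ra is 0xf4 bytes past its start).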
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv