[RFC PATCH 0/6] riscv: support EIC770X/JH7110 noncoherent devices with XPbmtUC
From: Bo Gan <hidden>
Date: 2026-03-13 08:46:23
Also in:
linux-riscv
StarFive JH7110 and ESWIN EIC770X both have non-cache-coherent
peripherals. On JH7110[1], GPU/VOUT/VPU/ISP are routed to the sys port,
making them not cache-coherent. On EIC770X, all peripherals are routed
to the sys port, and none is cache-coherent. To make drivers work on
such platforms, the standard solution is to use Svpbmt and map the DMA
buffer as uncacheable. However, neither SoC supports Svpbmt. Instead,
they map the system memory twice, as cached and uncached. The uncached
alias implicitly applies the uncacheable PMA.

To support such platforms, this series introduces a special form of
Svpbmt, named "XPbmtUC". It is a synthetic PTE format in which a single
bit (UC) controls cacheability, and the bit position can be configured
at runtime. It is intended to model the physical memory aliasing with
minimal effort.

On JH7110, it aligns perfectly with the HW, as the aliased UC region
happens to be offset by 2^34. Thus, configuring XPbmtUC with bit=32
(the PPN is shifted by 2) is all that needs to be done.

On EIC770X, the aliased UC region sits at an awkward offset, and since
there can be 2 NUMA nodes (dual-die) with 2 separate memory regions and
their UC alias counterparts, we instead ask the firmware to provide a
thin-layer hypervisor to rearrange the memory map. XPbmtUC is enabled
with bit=38, which maps all UC pages at 2^40 and above (the upper half
of the 2^41 range), and the underlying hypervisor remaps the 2^40+
addresses to the appropriate UC alias regions. (See the description in
PATCH 1/6.) We chose bit 38 (physical address bit 40) to make the
2-stage translation efficient. The hypervisor can use the Sv39x4
G-stage scheme and map everything with 1GB huge pages, consuming only
the first-level page table (16KB total) and several TLB entries.

In practice, it's the firmware/bootloader that configures XPbmtUC
through device-tree, based on firmware capabilities, and skips the
enablement on stock firmware.

This is tested on HiFive Premier P550 with the modified OpenSBI[2].
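As a rough illustration of the bit arithmetic described above (this is
not code from the series), the sketch below assumes the standard
Sv39/Sv48 PTE layout, with a 44-bit PPN field starting at PTE bit 10,
and 4KB pages. All helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12  /* assumption: 4KB base pages */
#define PTE_PPN_SHIFT   10  /* PPN field starts at PTE bit 10 (Sv39/Sv48) */
#define PTE_PPN_BITS    44  /* PPN field occupies PTE bits 53:10 */

/* Physical-address offset selected by a UC bit at PTE bit position pte_bit. */
static uint64_t uc_alias_offset(unsigned int pte_bit)
{
	assert(pte_bit >= PTE_PPN_SHIFT && pte_bit < PTE_PPN_SHIFT + PTE_PPN_BITS);
	/* PTE bit N lands on physical address bit N - 10 + 12 = N + 2. */
	return 1ULL << (pte_bit - PTE_PPN_SHIFT + PAGE_SHIFT);
}

/* Hypothetical helper: build a leaf PTE for pa, optionally setting the UC bit. */
static uint64_t make_pte(uint64_t pa, uint64_t prot, unsigned int uc_bit,
			 int uncached)
{
	uint64_t pte = ((pa >> PAGE_SHIFT) << PTE_PPN_SHIFT) | prot;

	if (uncached)
		pte |= 1ULL << uc_bit;
	return pte;
}

/* Physical address that a plain (non-Svpbmt) MMU derives from the PTE. */
static uint64_t pte_to_pa(uint64_t pte)
{
	uint64_t ppn = (pte >> PTE_PPN_SHIFT) & ((1ULL << PTE_PPN_BITS) - 1);

	return ppn << PAGE_SHIFT;
}

/* Sv39x4 G-stage: 41-bit guest physical addresses, 1GB per root entry. */
#define SV39X4_GPA_BITS 41
#define GIGAPAGE_SHIFT  30

/* Size of a root table that maps the whole 2^41 range with 1GB pages. */
static uint64_t sv39x4_root_table_bytes(void)
{
	uint64_t entries = 1ULL << (SV39X4_GPA_BITS - GIGAPAGE_SHIFT);

	return entries * sizeof(uint64_t);
}
```

Under these assumptions, uc_alias_offset(32) gives 2^34 (the JH7110
case), uc_alias_offset(38) gives 2^40 (the EIC770X case), and the
Sv39x4 root table comes out at 2048 entries * 8 bytes = 16KB.
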
The modified OpenSBI runs the host Linux in VS mode and provides the
aforementioned remapping. The performance penalty (if not running KVM
in Linux) is minimal, as the CPU is never switched to HS mode. A very
slight, unavoidable slowdown comes from external interrupt delivery:
due to the lack of AIA in EIC770X, all device IRQs now need to trap to
M mode first, before being forwarded to VS mode. The overhead of
running KVM in such a setup is not yet known, and may well be
noticeable, as all HS-qualified instructions will trap to M mode, and
there's also the extra cost of flushing G/VS-stage TLBs. I'm analyzing
it in parallel.

I'm aware there's an ongoing series that Samuel sent for physical
memory aliases. I haven't been following it too closely, but if you're
worried about it touching too many areas, I hope my series can shed
some light on the problem. My change is very minimal and local, and
also fairly easy to remove if we later decide to deprecate it.

[1] https://github.com/starfive-tech/JH7100_Docs/blob/main/JH7100%20Cache%20Coherence%20V1.0.pdf
[2] https://github.com/ganboing/opensbi/tree/eic77x-vspt-physalias-wip

Bo Gan (6):
  riscv: Add a custom, simplified version of Svpbmt "XPbmtUC"
  riscv: alternatives: support auipc+load pair
  riscv: apply page table attribute bits for XPbmtUC
  riscv: select RISCV_ISA_XPBMTUC in STARFIVE and ESWIN SoC
  riscv: dts: starfive: jh7110: activate XPbmtUC
  [TESTING-ONLY] riscv: dts: eswin: eic7700: activate XPbmtUC

 arch/riscv/Kconfig                       | 12 ++++++++++++
 arch/riscv/Kconfig.socs                  |  2 ++
 arch/riscv/boot/dts/eswin/eic7700.dtsi   |  1 +
 arch/riscv/boot/dts/starfive/jh7110.dtsi |  1 +
 arch/riscv/include/asm/errata_list.h     | 17 +++++++++++++++--
 arch/riscv/include/asm/hwcap.h           |  1 +
 arch/riscv/include/asm/insn.h            |  8 ++++++++
 arch/riscv/include/asm/pgtable-64.h      | 17 ++++++++++++++++-
 arch/riscv/kernel/alternative.c          | 11 ++++++-----
 arch/riscv/kernel/cpufeature.c           |  8 ++++++++
 arch/riscv/mm/pgtable.c                  |  7 +++++++
 11 files changed, 77 insertions(+), 8 deletions(-)

-- 
2.34.1