Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
From: Guo Ren <guoren@kernel.org>
Date: 2019-06-23 16:35:52
Also in: kvmarm, linux-riscv, lkml
Thx Catalin,

On Fri, Jun 21, 2019 at 10:16 PM Catalin Marinas wrote:
> On Wed, Jun 19, 2019 at 07:51:03PM +0800, Guo Ren wrote:
> > On Wed, Jun 19, 2019 at 4:54 PM Julien Grall wrote:
> > > On 6/19/19 9:07 AM, Guo Ren wrote:
> > > > Moving the arm ASID allocator code into a generic one is a good
> > > > idea. I've made a patchset for C-SKY and testing is in progress.
> > > > See:
> > > > https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guoren@kernel.org/
> > > >
> > > > If you plan to separate it into a generic one, I could work
> > > > with you on it.
> > >
> > > Did the ASID allocator work out of the box on C-Sky?
> >
> > Almost done, but one question: arm64 removed this code from
> > switch_mm():
> >
> >   cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >   cpumask_set_cpu(cpu, mm_cpumask(next));
> >
> > Why? Although arm64 cache operations could affect all harts with
> > CTC method of interconnect, I think we should keep this code for
> > primitive integrity in Linux, because cpu_bitmap is in mm_struct
> > rather than in mm->context.
>
> We didn't have a use for this in the arm64 code, so no point in
> maintaining the mm_cpumask. On some arm32 systems (ARMv6) with no
> hardware broadcast of some TLB/cache operations, we use it to track
> where the task has run to issue IPI for TLB invalidation or some
> deferred I-cache invalidation.
arm64 dropped the mm_cpumask set/clear operations that arm32 still
performs. This seems to have no side effects on current arm64 systems,
but it is wrong from a software-semantics point of view. I think we
should maintain mm_cpumask just as arm32 does.
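
For readers following along, the bookkeeping under discussion can be sketched in user space. This is only an illustration of the two quoted switch_mm() lines and of how a mask could select IPI targets for TLB invalidation. `struct mm_model`, `switch_mm_model()` and `tlb_ipi_targets()` are hypothetical names invented for this sketch, not kernel APIs, and a plain 32-bit word stands in for the real cpumask.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct mm_struct; only the CPU mask is modelled. */
struct mm_model {
	uint32_t cpumask;	/* bit n set: mm is currently scheduled in on CPU n */
};

/* Models the two lines quoted from switch_mm(). */
static void switch_mm_model(struct mm_model *prev, struct mm_model *next,
			    unsigned int cpu)
{
	/* cpumask_clear_cpu(cpu, mm_cpumask(prev)); */
	prev->cpumask &= ~(UINT32_C(1) << cpu);
	/* cpumask_set_cpu(cpu, mm_cpumask(next)); */
	next->cpumask |= UINT32_C(1) << cpu;
}

/*
 * On a system without hardware broadcast of TLB operations, these are the
 * remote CPUs that would need an IPI; the calling CPU flushes locally.
 */
static uint32_t tlb_ipi_targets(const struct mm_model *mm, unsigned int self)
{
	return mm->cpumask & ~(UINT32_C(1) << self);
}
```

Note that this variant clears the bit for the task being scheduled out, so the mask says nothing about stale TLB entries left behind on that CPU.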
> (there was also a potential optimisation on arm64 to avoid broadcast
> TLBI if the task only ran on a single CPU but Will found that was
> rarely the case on an SMP system because of rebalancing happening
> during execve(), ending up with two bits set in the mm_cpumask)
>
> The way you use it on csky is different from how it is done on arm. It
> seems to clear the mask for the scheduled out (prev) task but this
> wouldn't work on arm(64) since the TLB still contains prev entries
> tagged with the scheduled out ASID. Whether it matters, I guess it
> depends on the specifics of your hardware.
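
The single-CPU optimisation mentioned in the parenthesis can be illustrated with a small user-space sketch. It assumes a "sticky" mask whose bits are set when the task runs on a CPU and never cleared on switch-out. `note_ran_on()` and `local_tlbi_is_enough()` are hypothetical helpers for this illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Record that the task ran on `cpu`; bits are never cleared on switch-out. */
static uint32_t note_ran_on(uint32_t ran_mask, unsigned int cpu)
{
	return ran_mask | (UINT32_C(1) << cpu);
}

/*
 * A local (non-broadcast) invalidation is only safe if `cpu` is the one
 * and only CPU the task has ever run on.
 */
static bool local_tlbi_is_enough(uint32_t ran_mask, unsigned int cpu)
{
	return ran_mask == (UINT32_C(1) << cpu);
}
```

With rebalancing during execve() moving the task once, the mask already has two bits set and the check fails, which matches the observation that the optimisation rarely applied.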
Sorry for the mistaken quote. What I meant is what arm32 does: clear
all bits of mm->cpu_mask in new_context(), then set them back one by
one. Here is my patch:
https://lore.kernel.org/linux-csky/CAJF2gTQ0xQtQY1t-g9FgWaxfDXppMkFooCQzTFy7+ouwUfyA6w@mail.gmail.com/T/#m2ed464d2dfb45ac6f5547fb3936adf2da456cb65
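
A minimal user-space sketch of that scheme, as described in the paragraph above, might look as follows. It is a simplified model and not the arm32 code: the ASID allocation is a bare counter with no bitmap or reserved ASIDs, and every name ending in `_model` is hypothetical.

```c
#include <stdint.h>

#define ASID_BITS_MODEL			8
#define ASID_FIRST_VERSION_MODEL	(UINT64_C(1) << ASID_BITS_MODEL)

/* Hypothetical stand-in for the mm fields involved. */
struct mm_ctx_model {
	uint64_t context_id;	/* generation in the upper bits, ASID below */
	uint32_t cpumask;	/* CPUs that used the mm's current ASID */
};

static uint64_t generation_model = ASID_FIRST_VERSION_MODEL;
static uint64_t next_asid_model = 1;

/* Start a new generation; every older context_id becomes stale. */
static void rollover_model(void)
{
	generation_model += ASID_FIRST_VERSION_MODEL;
	next_asid_model = 1;
}

/* Hand out a fresh ASID and forget where the old one was used. */
static void new_context_model(struct mm_ctx_model *mm)
{
	if (next_asid_model >= ASID_FIRST_VERSION_MODEL)
		rollover_model();
	mm->context_id = generation_model | next_asid_model++;
	mm->cpumask = 0;	/* clear all bits ... */
}

static void check_and_switch_context_model(struct mm_ctx_model *mm,
					   unsigned int cpu)
{
	/* A generation mismatch means the ASID is stale. */
	if ((mm->context_id ^ generation_model) >> ASID_BITS_MODEL)
		new_context_model(mm);
	mm->cpumask |= UINT32_C(1) << cpu;	/* ... and set them back one by one */
}
```

In this model the mask is only reset when the mm receives a new ASID, so between rollovers it accumulates every CPU on which the current ASID may still have TLB entries.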
> While the algorithm may seem fairly generic, the semantics have a few
> corner cases specific to each architecture. See [1] for a description
> of the semantics we need on arm64 (CnP is a feature where the hardware
> threads of the same core can share the TLB; the original algorithm
> violated the requirements when this feature was enabled).
C-SKY SMP has only one hart per core, but here is a patch [1] with my
thoughts on avoiding duplicate TLB flushes with SMT:

[1] https://lore.kernel.org/linux-csky/1561305869-18872-1-git-send-email-guoren@kernel.org/T/#u

As for the TLA+ model, I still need to learn more before I can discuss
it with you.
> BTW, if you find the algorithm fairly straightforward ;), see this
> bug-fix which took a formal model to identify: a8ffaaa060b8 ("arm64:
> asid: Do not replace active_asids if already 0").

I think this is one of the cases where other archs could also benefit
from arm's ASID allocator code. Btw, was this detected by arm's ASID
allocator TLA+ model, or did it come from a real bug report?

--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel