Re: [RFC] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault
From: Nadav Amit <hidden>
Date: 2021-03-31 16:37:10
Also in: lkml
On Mar 31, 2021, at 6:16 AM, Mel Gorman [off-list ref] wrote:

> On Wed, Mar 31, 2021 at 07:20:09PM +0800, Huang, Ying wrote:
>> Mel Gorman [off-list ref] writes:
>>
>>> On Mon, Mar 29, 2021 at 02:26:51PM +0800, Huang Ying wrote:
>>>> For NUMA balancing, in hint page fault handler, the faulting page
>>>> will be migrated to the accessing node if necessary. During the
>>>> migration, TLB will be shot down on all CPUs that the process has
>>>> run on recently. Because in the hint page fault handler, the PTE
>>>> will be made accessible before the migration is tried. The overhead
>>>> of TLB shooting down is high, so it's better to be avoided if
>>>> possible. In fact, if we delay mapping the page in PTE until
>>>> migration, that can be avoided. This is what this patch doing.
>>>
>>> Why would the overhead be high? It was previously inaccessibly so
>>> it's only parallel accesses making forward progress that trigger
>>> the need for a flush.
>>
>> Sorry, I don't understand this. Although the page is inaccessible,
>> the threads may access other pages, so TLB flushing is still
>> necessary.
>
> You assert the overhead of TLB shootdown is high and yes, it can be
> very high but you also said "the benchmark score has no visible
> changes" indicating the TLB shootdown cost is not a major problem for
> the workload. It does not mean we should ignore it though.

If you are looking for a benchmark that is negatively affected by NUMA
balancing, then IIRC Parsec’s dedup is such a workload. [1]

[1] https://parsec.cs.princeton.edu/