Re: [RFC PATCH 0/5] madvise MADV_DOEXEC
From: Matthew Wilcox <willy@infradead.org>
Date: 2021-08-16 14:29:27
Also in: lkml
On Mon, Aug 16, 2021 at 04:10:28PM +0200, David Hildenbrand wrote:
>>>> Until recently, the CPUs only had 4 1GB TLB entries. I'm sure we
>>>> still have customers using that generation of CPUs. 2MB pages perform
>>>> better than 1GB pages on the previous generation of hardware, and I
>>>> haven't seen numbers for the next generation yet.
>>>
>>> I read that somewhere else before, yet we have heavy 1 GiB page users,
>>> especially in the context of VMs and DPDK.
>>
>> I wonder if those users actually benchmarked. Or whether the memory
>> savings worked out so well for them that the loss of TLB performance
>> didn't matter.
>
> These applications are extremely performance sensitive (i.e., RT
> workloads),
"real time does not mean real fast". It means predictable latency.
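
For scale, the 4-entry figure quoted above can be turned into a TLB reach number. This is a minimal sketch, assuming reach is simply the number of entries times the page size. The function names and the 16 GiB working set are illustrative and do not come from the thread, and reach alone does not predict which page size performs better on a given CPU.

```python
GIB = 1 << 30


def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """Address space covered when each of `entries` TLB slots maps one page."""
    return entries * page_size


def working_set_fits(working_set: int, entries: int, page_size: int) -> bool:
    """True if the whole working set can be mapped without evicting an entry."""
    return working_set <= tlb_reach_bytes(entries, page_size)


if __name__ == "__main__":
    # 4 entries for 1GB pages: the figure quoted in the thread.
    reach = tlb_reach_bytes(4, GIB)
    print(f"4 x 1GB entries cover {reach // GIB} GiB")

    # HYPOTHETICAL 16 GiB working set, chosen only for illustration.
    print("16 GiB working set fits:", working_set_fits(16 * GIB, 4, GIB))
```

With only four entries, any working set larger than 4 GiB has to take TLB misses even when backed by 1GB pages.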
>>> I will rephrase my previous statement "hugetlbfs just doesn't raise
>>> these problems because we are special casing it all over the place
>>> already". For example, not allowing to swap such pages. Disallowing
>>> MADV_DONTNEED. Special hugetlbfs locking.
>>
>> Sure, that's why I want to drag this feature out of "oh this is a
>> hugetlb special case" and into "this is something Linux supports".
>
> I would have understood the move to optimize SHMEM internally - similar
> to how we seem to optimize hugetlbfs SHMEM right now internally.
> (although sharing page tables for shmem can still be quite tricky)
>
> I did not follow why we have to play games with MAP_PRIVATE, and having
> private anonymous pages shared between processes that don't COW,
> introducing new syscalls etc.
It's not about SHMEM, it's about file-backed pages on regular filesystems. I don't want to have XFS, ext4 and btrfs all with their own implementations of ARCH_WANT_HUGE_PMD_SHARE.