Re: [PATCH] mm/hugetlb: Use the right pte val for compare in hugetlb_cow
From: Aneesh Kumar K.V <hidden>
Date: 2016-10-19 05:11:43
Also in: linux-mm, lkml
Andrew Morton [off-list ref] writes:
On Tue, 18 Oct 2016 21:12:45 +0530 "Aneesh Kumar K.V" [off-list ref] wrote:
We cannot use the pte value passed to set_pte_at for the pte_same comparison, because archs like ppc64 filter/add new pte flags in set_pte_at. Instead, fetch the pte value inside hugetlb_cow. We compare the pte value to make sure the pte didn't change since we dropped the page table lock. hugetlb_cow gets called with the page table lock held, so we can take a copy of the pte value before we drop the page table lock.

With hugetlbfs, we optimize the MAP_PRIVATE write fault path with no previous mapping (huge_pte_none entries) by forcing a cow in the fault path. This avoids taking an additional fault to convert a read-only mapping to read/write. Here we were comparing a recently instantiated pte (via set_pte_at) to the pte value from the linux page table. As explained above, on ppc64 such a pte_same check returned the wrong result, resulting in us taking an additional fault on ppc64.

From my reading this is a minor performance improvement and a -stable backport isn't needed. But it is unclear whether the impact warrants a 4.9 merge.
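For readers following the pte_same discussion in the quoted patch description, here is a minimal userspace sketch of the comparison problem. It is a toy model, not kernel code: the toy_* names, the flag values, and the single-entry table are all hypothetical, and the "arch adds a flag on store" behaviour only stands in for what the description says ppc64 does in set_pte_at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these are the kernel's real definitions. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PTE_PRESENT 0x1ULL
#define TOY_PTE_ARCH    0x100ULL /* hypothetical flag the arch adds on store */

static toy_pte_t toy_page_table[1];

static bool toy_pte_same(toy_pte_t a, toy_pte_t b)
{
	return a.val == b.val;
}

/* Models an arch whose set_pte_at modifies the value before storing it. */
static void toy_set_pte_at(toy_pte_t *ptep, toy_pte_t pte)
{
	ptep->val = pte.val | TOY_PTE_ARCH;
}

static toy_pte_t toy_ptep_get(const toy_pte_t *ptep)
{
	return *ptep;
}

/* Old scheme: recheck the table against the value handed to set_pte_at. */
bool recheck_with_caller_value(void)
{
	toy_pte_t new_pte = { TOY_PTE_PRESENT };

	toy_set_pte_at(&toy_page_table[0], new_pte);
	/* (the real code drops and retakes the page table lock here) */
	return toy_pte_same(toy_ptep_get(&toy_page_table[0]), new_pte);
}

/* New scheme: snapshot the stored entry while the lock is still held. */
bool recheck_with_table_snapshot(void)
{
	toy_pte_t new_pte = { TOY_PTE_PRESENT };
	toy_pte_t snapshot;

	toy_set_pte_at(&toy_page_table[0], new_pte);
	snapshot = toy_ptep_get(&toy_page_table[0]);
	/* (the real code drops and retakes the page table lock here) */
	return toy_pte_same(toy_ptep_get(&toy_page_table[0]), snapshot);
}

/* Self-check: the stale caller value mismatches, the snapshot matches. */
void toy_demo(void)
{
	assert(!recheck_with_caller_value());
	assert(recheck_with_table_snapshot());
}
```

In this model the first recheck fails even though nothing else touched the entry, which corresponds to the spurious mismatch the description attributes to ppc64. The second recheck succeeds because both sides of the comparison come from the table.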
This patch works around the issue reported at
https://lkml.kernel.org/r/57FF7BB4.1070202@redhat.com

The reason for that OOM was a reserve count accounting issue which happens in the error path of hugetlb_cow. Now this patch avoids taking the error path, and hence we don't have the reported OOM. An actual fix for that issue is being worked on by Mike Kravetz.
Please be careful about describing end-user visible impacts when fixing bugs, thanks.
-aneesh