Thread: 23 messages, 12 authors, 2025-01-14

Re: [PATCH 3/8] x86/mm/pat: Restore large pages after fragmentation

From: Mike Rapoport <rppt@kernel.org>
Date: 2025-01-12 08:55:10
Also in: linux-kselftest, linux-mm, linux-modules, linux-um, live-patching, lkml

Hi Kirill,

On Fri, Jan 10, 2025 at 12:36:59PM +0200, Kirill A. Shutemov wrote:
On Fri, Dec 27, 2024 at 09:28:20AM +0200, Mike Rapoport wrote:
From: "Kirill A. Shutemov" <redacted>

Changing page attributes may lead to fragmentation of the direct
mapping over time and, as a result, to performance degradation.

With the current code it's a one-way road: the kernel tries to avoid
splitting large pages, but it doesn't restore them even if the page
attributes become compatible again.

Any change to the mapping may potentially allow a large page to be restored.

Hook into the cpa_flush() path to check whether there are any pages to be
recovered in the PUD_SIZE range around the pages we've just touched.

CPUs don't like[1] to have TLB entries of different sizes for the
same memory, but it looks like it's okay as long as these entries have
matching attributes[2]. Therefore it's critical to flush the TLB before any
following changes to the mapping.

Note that we already allow multiple TLB entries of different sizes
for the same memory in the split_large_page() path. It's not a new
situation.

set_memory_4k() provides a way to use 4k pages on purpose. The kernel must
not remap such pages as large. Re-use one of the software PTE bits to
indicate such pages.
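
The collapse condition described above can be illustrated with a minimal
userspace sketch. It is not the code from the patch: the flat 64-bit entry
layout, the attribute mask, the choice of bit 9 as the "keep as 4k" software
bit, and the names make_pte() and can_collapse() are all assumptions made for
illustration. It only models the idea that a table of 4k entries may be
replaced by one large mapping when the entries are physically contiguous,
suitably aligned, carry identical attributes, and are not marked as
deliberately 4k.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model, not the real x86 PTE layout. */
#define PTRS_PER_TABLE 512u         /* entries in one page table */
#define PFN_SHIFT      12           /* assumed: PFN stored above bit 12 */
#define ATTR_MASK      0xfffULL     /* assumed: low 12 bits are attributes */
#define SOFT_FORCE_4K  (1ULL << 9)  /* hypothetical "keep as 4k" software bit */

typedef uint64_t pte_t;

/* Build a model entry from a page frame number and attribute bits. */
static pte_t make_pte(uint64_t pfn, uint64_t attrs)
{
    return (pfn << PFN_SHIFT) | (attrs & ATTR_MASK);
}

/*
 * Return true if all entries of the table could be replaced by a single
 * large mapping: the first PFN is aligned to the table size, PFNs are
 * contiguous, attributes match, and the software bit is not set.
 */
static bool can_collapse(const pte_t *table)
{
    assert(table != NULL);

    uint64_t first_pfn = table[0] >> PFN_SHIFT;
    uint64_t attrs = table[0] & ATTR_MASK;

    if (attrs & SOFT_FORCE_4K)
        return false;               /* requested as 4k on purpose */
    if (first_pfn % PTRS_PER_TABLE)
        return false;               /* large page must be aligned */

    for (size_t i = 1; i < PTRS_PER_TABLE; i++) {
        /* Attributes include the software bit, so a single marked
         * entry is also caught here as a mismatch. */
        if ((table[i] & ATTR_MASK) != attrs)
            return false;
        if ((table[i] >> PFN_SHIFT) != first_pfn + i)
            return false;
    }
    return true;
}
```

In this model a single entry with different attributes, or with the software
bit set, is enough to keep the whole range mapped with 4k pages.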

[1] See Erratum 383 of AMD Family 10h Processors
[2] https://lore.kernel.org/linux-mm/1da1b025-cabc-6f04-bde5-e50830d1ecf0@amd.com/

[rppt@kernel.org:
 * s/restore/collapse/
 * update formatting per peterz
 * use 'struct ptdesc' instead of 'struct page' for list of page tables to
   be freed
 * try to collapse PMD first and if it succeeds move on to PUD as peterz
   suggested
 * flush TLB twice: for changes done in the original CPA call and after
   collapsing of large pages
]

Link: https://lore.kernel.org/all/20200416213229.19174-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <redacted>
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

When I originally attempted this, the patch was dropped because of
performance regressions. Was it addressed somehow?

I didn't realize the patch was dropped because of performance regressions,
so I didn't address it.

Do you remember where the regressions showed up?
 
-- 
  Kiryl Shutsemau / Kirill A. Shutemov
-- 
Sincerely yours,
Mike.