Re: [PATCH -V2] mm: move idle swap cache pages to the tail of LRU after COW
From: Johannes Weiner <hannes@cmpxchg.org>
Date: 2021-05-27 16:53:02
On Thu, May 27, 2021 at 04:49:53PM +0800, Huang Ying wrote:
With commit 09854ba94c6a ("mm: do_wp_page() simplification"), after
COW, the idle swap cache (neither the page nor the corresponding swap
entry is mapped by any process) will be left at the original position
in the LRU list. That position may be in the active list or at the
head of the inactive list, so vmscan may spend more overhead or time
before it reclaims these actually unused pages.
To help page reclaim, this patch tries to move the idle swap cache
to the tail of the inactive LRU list after COW. To avoid introducing
much overhead to the hot COW code path, all locks are acquired with
try locking.
To test the patch, we used the pmbench memory access benchmark with
a working set larger than available memory on a 2-socket Intel server
with an NVMe SSD as the swap device. Test results show that the
pmbench score increases by up to 21.8%, with decreased swap cache
size and swapin throughput.
Signed-off-by: "Huang, Ying" <redacted>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: Andrea Arcangeli <redacted>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Dave Hansen <redacted>
Cc: Tim Chen <redacted>
V2:
- Move trylock_page() to try_to_free_idle_swapcache() per Rik and
Linus' comments.
- Fix PageLRU() checking.
- Fix THP processing.
- Rename the function.
---
include/linux/memcontrol.h | 10 ++++++++++
include/linux/swap.h | 3 +++
mm/memcontrol.c | 12 ++++++++++++
mm/memory.c | 2 ++
mm/swapfile.c | 39 ++++++++++++++++++++++++++++++++++++++
5 files changed, 66 insertions(+)

Sorry the discussion fizzled out on the last patch. Let me try to
recap this series:

On your first submission you directly freed the old page if we
copied. Linus was worried about overhead in the COW path that
wouldn't pay off in a real workload. Before getting numbers, it was
then suggested to move the pages to the tail of the LRU and leave
them to reclaim - which was also met with skepticism.

V2 presented the LRU moving version with pmbench numbers that indeed
show it pays off. However, the much simpler direct freeing produces
even better numbers in the same benchmark. We don't have numbers
showing whether the LRU shuffling would fare significantly better in
other workloads.

Purely looking at the code: whether we defer or free, we need to lock
the page, take the LRU spinlock for this page, and touch the LRU
linkage.

If we free, we add the swapcache deletion and the page allocator, but
it's most likely the percpu-cached fastpath.

If we defer, reclaim needs to re-establish information about the page
that we already had in the COW context, do another LRU operation, do
the swapcache deletion and go through the allocator, but on cold
caches.

Personally, I'm a bit skeptical the extra code complexity and reclaim
overhead in paging workloads will definitely pay off in
intermittently paging loads (non-paging wouldn't have swap pages).

As far as code goes, the point of 09854ba94c6a (+17, -42) was
simplification, and this adds more lines back in another place. In
particular it adds another lifetime variant to swap pages, which are
already somewhat unwieldy. OTOH, freeing is a two-liner reusing the
swap unmap code:

	if (page_copied)
		free_swap_cache(old_page);

Linus, what do you think?
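
To make the free-vs-defer comparison above concrete, here is a rough userspace toy model. It is not kernel code: every toy_* name is hypothetical, there is no locking, swapcache or allocator in it, and it only counts how often the LRU linkage of the old page is touched under each option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model. None of these names are kernel APIs. */
struct toy_page {
	struct toy_page *prev, *next;
};

struct toy_lru {
	struct toy_page head;	/* sentinel: head.next is the head, head.prev the tail */
	unsigned int list_ops;	/* LRU linkage updates done on behalf of the old page */
};

static void toy_lru_init(struct toy_lru *lru)
{
	lru->head.prev = lru->head.next = &lru->head;
	lru->list_ops = 0;
}

/* Link @page immediately before @next. */
static void toy_link_before(struct toy_page *page, struct toy_page *next)
{
	page->prev = next->prev;
	page->next = next;
	next->prev->next = page;
	next->prev = page;
}

static void toy_unlink(struct toy_page *page)
{
	page->prev->next = page->next;
	page->next->prev = page->prev;
}

/* Option A: free in the COW context. One linkage update, page is gone. */
static void toy_cow_free(struct toy_lru *lru, struct toy_page *old_page)
{
	toy_unlink(old_page);
	lru->list_ops++;
	free(old_page);
}

/* Option B, step 1: rotate the page to the tail in the COW context. */
static void toy_cow_defer(struct toy_lru *lru, struct toy_page *old_page)
{
	toy_unlink(old_page);
	toy_link_before(old_page, &lru->head);
	lru->list_ops++;
}

/* Option B, step 2: reclaim later takes the page off the tail. */
static struct toy_page *toy_reclaim_tail(struct toy_lru *lru)
{
	struct toy_page *victim = lru->head.prev;

	if (victim == &lru->head)
		return NULL;	/* list is empty */
	toy_unlink(victim);
	lru->list_ops++;
	return victim;
}

/* Build a three-page list, dispose of the middle page, return the update count. */
unsigned int toy_run(bool defer)
{
	struct toy_lru lru;
	struct toy_page *pages[3];
	struct toy_page *victim;
	int i;

	toy_lru_init(&lru);
	for (i = 0; i < 3; i++) {
		pages[i] = malloc(sizeof(*pages[i]));
		if (!pages[i])
			abort();
		/* Setup only: insert at the head, not counted. */
		toy_link_before(pages[i], lru.head.next);
	}

	if (defer) {
		toy_cow_defer(&lru, pages[1]);
		victim = toy_reclaim_tail(&lru);
		assert(victim == pages[1]);	/* the rotated page is reclaimed first */
		free(victim);
	} else {
		toy_cow_free(&lru, pages[1]);
	}

	free(pages[0]);
	free(pages[2]);
	return lru.list_ops;
}
```

In this model toy_run(false) touches the linkage once and toy_run(true) touches it twice, which only mirrors the "another LRU operation" point. It says nothing about cache warmth or allocator cost, which is where the real difference would be.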