Re: [PATCH] mm/migrate: fix page state accounting type conversion underflow
From: Matthew Wilcox <willy@infradead.org>
Date: 2021-07-27 16:20:31
On Thu, Jul 22, 2021 at 03:48:40PM +1000, Nicholas Piggin wrote:
quoted hunk ↗ jump to hunk
Similarly to commit 2da9f6305f306 ("mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit"), fix -ve int -> unsigned int -> long bug. Reported-by: Alexey Kardashevskiy <redacted> Fixes: c5fc5c3ae0c84 ("mm: migrate: account THP NUMA migration counters correctly") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)diff --git a/mm/migrate.c b/mm/migrate.c index 34a9ad3e0a4f..7e240437e7d9 100644 --- a/mm/migrate.c +++ b/mm/migrate.c@@ -2068,7 +2068,7 @@ int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma, LIST_HEAD(migratepages); new_page_t *new; bool compound; - unsigned int nr_pages = thp_nr_pages(page); + int nr_pages = thp_nr_pages(page);
This has prompted me to go off and look through the folio work. There are a number of similar problems. It's sad that gcc doesn't have a warning that catched this (although Clang does!) I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101645 In the meantime, I'm going to change: - folio_nr_pages() now returns 'long' instead of 'unsigned long'. - Everywhere that assigns it to a variable changes its type from 'int' or 'unsigned int' or 'unsigned long' to 'long'. The only place this looks a little dodgy is: +static inline bool folio_contains(struct folio *folio, pgoff_t index) +{ + /* HugeTLBfs indexes the page cache in units of hpage_size */ + if (folio_test_hugetlb(folio)) + return folio->index == index; + return index - folio_index(folio) < folio_nr_pages(folio); +} but i'm pretty sure that's OK because index & folio_index are both unsigned long, and by my reading of the C spec, that promotes long to unsigned long, and so we do an unsigned comparison (meaning that index 3, folio_index() 4 will wrap around to ULONG_MAX and the comparison will return false, as expected. I also intend to change update_lru_size() to take a long. This patch is insufficient ... mem_cgroup_move_account() has the same bug. I'm not quite sure how to handle this patch -- I'm going to replace all this in 5.15, but this should be backported to earlier kernels.