Re: [RFC PATCH 3/3] mm/migrate: Create move_phys_pages syscall
From: Gregory Price <hidden>
Date: 2023-09-11 22:25:03
Also in:
linux-arch, linux-cxl, lkml
On Mon, Sep 11, 2023 at 07:26:45PM +0200, Arnd Bergmann wrote:
quoted hunk ↗ jump to hunk
On Sun, Sep 10, 2023, at 14:52, Gregory Price wrote:quoted
I'll clean up the current implementation for what I have on a v2 of an RFC, and then look at adding some pull-ahead patches to fix both move_pages and move_phys_pages for compat processes. Might take me a bit, I've only done compat work once before and I remember it being annoying to get right.I think what you want is roughly this (untested):--- a/mm/migrate.c +++ b/mm/migrate.c@@ -2159,6 +2159,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes, const int __user *nodes, int __user *status, int flags) { + struct compat_uptr_t __user *compat_pages = (void __user *)pages; int current_node = NUMA_NO_NODE; LIST_HEAD(pagelist); int start, i;@@ -2171,8 +2172,17 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes, int node; err = -EFAULT; - if (get_user(p, pages + i)) - goto out_flush; + if (in_compat_syscall() { + compat_uptr_t cp; + + if (get_user(cp, compat_pages + i)) + goto out_flush; + + p = compat_ptr(cp); + } else { + if (get_user(p, pages + i)) + goto out_flush; + } if (get_user(node, nodes + i)) goto out_flush;alternatively you could use the get_compat_pages_array() helper that is already used in the do_pages_stat() function.
Appreciated, i'll give it a hack before i submit V2. Just to be clear, it sounds like you want move_pages to be converted from (const __user * __user *pages) to (const __u64 __user *pages) as well, correct? That seems like a fairly trivial change.
quoted
This only requires plumbing new 2 flags through do_pages_move, and no new user-exposed types or information. Is there an ick-factor with the idea of adding the following? MPOL_MF_PHYS_ADDR : Treat page migration addresses as physical MPOL_MF_PFN : Treat page migration addresses as PFNsI would strongly prefer supporting only one of the two, and a 64-bit physical address seems like the logical choice here. I agree that this doesn't introduce any additional risk for rowhammer attacks, but it seems slightly more logical to me to use CAP_SYS_ADMIN if that is what the other interfaces use that handle physical addresses and may leak address information. Arnd
Fair enough, I'll swap to ADMIN and limit to phys_addr. I suppose I could add /sys/kernel/mm/page_size accessible only by root for the same purpose, so that PFNs from idle and such can be useful. I don't know of another way for userland to determine the shift. ~Gregory