Re: [PATCH v6 4/5] KVM: guest_memfd: Enforce NUMA mempolicy using shared policy
From: Shivank Garg <hidden>
Date: 2025-03-05 06:02:53
Also in:
kvm, linux-fsdevel, lkml
On 3/4/2025 10:29 PM, Sean Christopherson wrote:
On Tue, Mar 04, 2025, David Hildenbrand wrote:quoted
On 04.03.25 16:30, Sean Christopherson wrote:quoted
On Tue, Mar 04, 2025, Ackerley Tng wrote:quoted
Vlastimil Babka [off-list ref] writes:quoted
quoted
struct shared_policy should be stored on the inode rather than the file, since the memory policy is a property of the memory (struct inode), rather than a property of how the memory is used for a given VM (struct file).That makes sense. AFAICS shmem also uses inodes to store policy.quoted
When the shared_policy is stored on the inode, intra-host migration [1] will work correctly, since the while the inode will be transferred from one VM (struct kvm) to another, the file (a VM's view/bindings of the memory) will be recreated for the new VM. I'm thinking of having a patch like this [2] to introduce inodes.shmem has it easier by already having inodesquoted
With this, we shouldn't need to pass file pointers instead of inode pointers.Any downsides, besides more work needed? Or is it feasible to do it using files now and convert to inodes later? Feels like something that must have been discussed already, but I don't recall specifics.Here's where Sean described file vs inode: "The inode is effectively the raw underlying physical storage, while the file is the VM's view of that storage." [1]. I guess you're right that for now there is little distinction between file and inode and using file should be feasible, but I feel that this dilutes the original intent.Hmm, and using the file would be actively problematic at some point. One could argue that NUMA policy is property of the VM accessing the memory, i.e. that two VMs mapping the same guest_memfd could want different policies. But in practice, that would allow for conflicting requirements, e.g. different policies in each VM for the same chunk of memory, and would likely lead to surprising behavior due to having to manually do mbind() for every VM/file view.I think that's the same behavior with shmem? I mean, if you have two people asking for different things for the same MAP_SHARE file range, surprises are unavoidable.Yeah, I was specifically thinking of the case where a secondary mapping doesn't do mbind() at all, e.g. could end up effectively polluting guest_memfd with "bad" allocations.
Thank you for the feedback. I agree that storing the policy in the inode is the correct approach, as it aligns with shmem's behavior. I now understand that keeping the policy in file-private data could lead to surprising behavior, especially with multiple VMs mapping the same guest_memfd. The inode-based approach also makes sense from a long-term perspective, especially with upcoming restricted mapping support. I'll pick the Ackerley's patch[1] to add support for gmem inodes. With this patch, it does not seem overly complex to implement to policy storage in inodes. I'll test this approach and submit a revised patch shortly. [1] https://lore.kernel.org/all/d1940d466fc69472c8b6dda95df2e0522b2d8744.1726009989.git.ackerleytng@google.com/ (local) Thanks, Shivank