Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd
From: Fuad Tabba <hidden>
Date: 2022-10-03 08:35:07
Also in:
kvm, linux-doc, linux-fsdevel, linux-mm, lkml, qemu-devel
Hi On Fri, Sep 30, 2022 at 5:23 PM Kirill A . Shutemov [off-list ref] wrote:
On Fri, Sep 30, 2022 at 05:14:00PM +0100, Fuad Tabba wrote:quoted
Hi, <...>quoted
diff --git a/mm/memfd_inaccessible.c b/mm/memfd_inaccessible.c new file mode 100644 index 000000000000..2d33cbdd9282 --- /dev/null +++ b/mm/memfd_inaccessible.c@@ -0,0 +1,219 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "linux/sbitmap.h" +#include <linux/memfd.h> +#include <linux/pagemap.h> +#include <linux/pseudo_fs.h> +#include <linux/shmem_fs.h> +#include <uapi/linux/falloc.h> +#include <uapi/linux/magic.h> + +struct inaccessible_data { + struct mutex lock; + struct file *memfd; + struct list_head notifiers; +}; + +static void inaccessible_notifier_invalidate(struct inaccessible_data *data, + pgoff_t start, pgoff_t end) +{ + struct inaccessible_notifier *notifier; + + mutex_lock(&data->lock); + list_for_each_entry(notifier, &data->notifiers, list) { + notifier->ops->invalidate(notifier, start, end); + } + mutex_unlock(&data->lock); +} + +static int inaccessible_release(struct inode *inode, struct file *file) +{ + struct inaccessible_data *data = inode->i_mapping->private_data; + + fput(data->memfd); + kfree(data); + return 0; +} + +static long inaccessible_fallocate(struct file *file, int mode, + loff_t offset, loff_t len) +{ + struct inaccessible_data *data = file->f_mapping->private_data; + struct file *memfd = data->memfd; + int ret; + + if (mode & FALLOC_FL_PUNCH_HOLE) { + if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len)) + return -EINVAL; + } + + ret = memfd->f_op->fallocate(memfd, mode, offset, len);I think that shmem_file_operations.fallocate is only set if CONFIG_TMPFS is enabled (shmem.c). Should there be a check at initialization that fallocate is set, or maybe a config dependency, or can we count on it always being enabled?It is already there: config MEMFD_CREATE def_bool TMPFS || HUGETLBFS And we reject inaccessible memfd_create() for HUGETLBFS. But if we go with a separate syscall, yes, we need the dependency.
I missed that, thanks.
quoted
quoted
+ inaccessible_notifier_invalidate(data, offset, offset + len); + return ret; +} +<...>quoted
+void inaccessible_register_notifier(struct file *file, + struct inaccessible_notifier *notifier) +{ + struct inaccessible_data *data = file->f_mapping->private_data; + + mutex_lock(&data->lock); + list_add(¬ifier->list, &data->notifiers); + mutex_unlock(&data->lock); +} +EXPORT_SYMBOL_GPL(inaccessible_register_notifier);If the memfd wasn't marked as inaccessible, or more generally speaking, if the file isn't a memfd_inaccessible file, this ends up accessing an uninitialized pointer for the notifier list. Should there be a check for that here, and have this function return an error if that's not the case?I think it is "don't do that" category. inaccessible_register_notifier() caller has to know what file it operates on, no?
The thing is, you could oops the kernel from userspace. For that, all you have to do is a memfd_create without the MFD_INACCESSIBLE, followed by a KVM_SET_USER_MEMORY_REGION using that as the private_fd. I ran into this using my port of this patch series to arm64. Cheers, /fuad
-- Kiryl Shutsemau / Kirill A. Shutemov