Thread: 54 messages, 9 authors, 2018-12-07

Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages

From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Date: 2018-12-06 20:19:39
Also in: linux-mm, lkml

On Thu, 2018-12-06 at 11:19 -0800, Andy Lutomirski wrote:
On Thu, Dec 6, 2018 at 11:01 AM Tycho Andersen [off-list ref] wrote:
On Thu, Dec 06, 2018 at 10:53:50AM -0800, Andy Lutomirski wrote:
If we are going to unmap the linear alias, why not do it at vmalloc()
time rather than vfree() time?
That’s not totally nuts. Do we ever have code that expects __va() to
work on module data?  Perhaps crypto code trying to encrypt static
data because our APIs don’t understand virtual addresses.  I guess if
highmem is ever used for modules, then we should be fine.

RO instead of not present might be safer.  But I do like the idea of
renaming Rick's flag to something like VM_XPFO or VM_NO_DIRECT_MAP and
making it do all of this.
Yeah, doing it for everything automatically seemed like it was/is
going to be a lot of work to debug all the corner cases where things
expect memory to be mapped but don't explicitly say it. And in
particular, the XPFO series only does it for user memory, whereas an
additional flag like this would work for extra paranoid allocations
of kernel memory too.
I just read the code, and it looks like vmalloc() is already using
highmem (__GFP_HIGHMEM) if available, so, on big x86_32 systems, for
example, we already don't have modules in the direct map.

So I say we go for it.  This should be quite simple to implement --
the pageattr code already has almost all the needed logic on x86.  The
only arch support we should need is a pair of functions: one to remove a
vmalloc address range from the direct map (if it was present in the
first place) and one to put it back.  On x86, this should only
be a few lines of code.

What do you all think?  This should solve most of the problems we have.

If we really wanted to optimize this, we'd make it so that
module_alloc() allocates memory the normal way, then, later on, we
call some function that, all at once, removes the memory from the
direct map and applies the right permissions to the vmalloc alias (or
just makes the vmalloc alias not-present so we can add permissions
later without flushing), and flushes the TLB.  And we arrange for
vunmap to zap the vmalloc range, then put the memory back into the
direct map, then free the pages back to the page allocator, with the
flush in the appropriate place.
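
A minimal user-space sketch of the ordering described above may make it
easier to follow. It only models the bookkeeping; no real page tables are
touched. The names (page_model, tlb_model, model_seal, model_vunmap) are
hypothetical and do not exist in the kernel, and the model assumes that
raising a direct-map entry from not-present back to RW needs no flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical permission states for one alias of a page. */
enum perm { PERM_NP, PERM_RO, PERM_RW, PERM_RX };

struct page_model {
    enum perm direct_map;    /* permission of the direct-map alias */
    enum perm vmalloc_alias; /* permission of the vmalloc alias */
    bool allocated;          /* still owned by the vmalloc area */
};

struct tlb_model {
    unsigned flushes; /* number of flushes issued so far */
    bool stale;       /* a permission was reduced since the last flush */
};

static void model_tlb_flush(struct tlb_model *tlb)
{
    tlb->flushes++;
    tlb->stale = false;
}

/*
 * Step 1: all at once, drop the pages from the direct map, apply the
 * final permission to the vmalloc alias, and flush a single time.
 */
static void model_seal(struct page_model *pages, size_t n,
                       enum perm final, struct tlb_model *tlb)
{
    for (size_t i = 0; i < n; i++) {
        pages[i].direct_map = PERM_NP;
        pages[i].vmalloc_alias = final;
    }
    tlb->stale = true;
    model_tlb_flush(tlb);
}

/*
 * Step 2: zap the vmalloc range, flush, put the pages back into the
 * direct map, and only then hand them back to the allocator.
 */
static void model_vunmap(struct page_model *pages, size_t n,
                         struct tlb_model *tlb)
{
    for (size_t i = 0; i < n; i++)
        pages[i].vmalloc_alias = PERM_NP;
    tlb->stale = true;
    model_tlb_flush(tlb);

    /* Assumption: NP -> RW is not cached by the TLB, so no flush here. */
    for (size_t i = 0; i < n; i++)
        pages[i].direct_map = PERM_RW;

    for (size_t i = 0; i < n; i++) {
        assert(!tlb->stale); /* never free pages with stale entries */
        pages[i].allocated = false;
    }
}
```

In this model each phase costs exactly one flush regardless of how many
pages are involved, which is the point of doing the changes all at once.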

I don't see why the page allocator needs to know about any of this.
It's already okay with the permissions being changed out from under it
on x86, and it seems fine.  Rick, do you want to give some variant of
this a try?
Hi,

Sorry, I've been having email troubles today.

I found some cases where vmap is called with PAGE_KERNEL_RO. Those would not
set NP/RO in the directmap, so it would be somewhat inconsistent whether the
directmap alias of a vmalloc range allocation is readable or not. I couldn't
see any places where that would cause problems today, though.

I was ready to assume that no TLB caches not-present (NP) entries, because I
don't know how usages where a page fault is used to load something could work
without lots of flushes. If that's the case, then all archs with directmap
permissions could share a single vmalloc special permission flush
implementation that works like Andy described originally. It could be
controlled with an ARCH_HAS_DIRECT_MAP_PERMS. We would just need something like
set_pages_np and set_pages_rw on any archs with directmap permissions. So that
seems simpler to me (and is what I have been doing) unless I'm missing the
problem.
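
As a rough illustration of the shared free path described above, here is a
user-space model in C. It is a sketch under assumptions, not kernel code:
model_set_pages_np and model_set_pages_rw are hypothetical stand-ins for the
arch hooks, ARCH_HAS_DIRECT_MAP_PERMS is written as a plain macro, and the
fallback branch for archs without directmap permissions is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Name taken from the message above; treating it as a 0/1 macro is assumed. */
#define ARCH_HAS_DIRECT_MAP_PERMS 1

/* State of the direct-map alias of one page in this model. */
struct dm_page {
    bool present;
    bool writable;
};

struct flush_stats {
    unsigned tlb_flushes;
};

/* Hypothetical stand-in for an arch hook that makes pages not-present. */
static void model_set_pages_np(struct dm_page *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i].present = false;
}

/* Hypothetical stand-in for an arch hook that makes pages present and RW. */
static void model_set_pages_rw(struct dm_page *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        p[i].present = true;
        p[i].writable = true;
    }
}

static void model_flush_tlb(struct flush_stats *s)
{
    s->tlb_flushes++;
}

/* Generic free path for an allocation with special permissions. */
static void model_free_special(struct dm_page *p, size_t n,
                               struct flush_stats *s)
{
#if ARCH_HAS_DIRECT_MAP_PERMS
    model_set_pages_np(p, n); /* 1. direct-map alias becomes unusable */
    model_flush_tlb(s);       /* 2. one flush covers both aliases */
    model_set_pages_rw(p, n); /* 3. NP -> RW, assumed to need no flush */
#else
    (void)p;
    (void)n;
    model_flush_tlb(s);       /* assumed: only the vmalloc alias to flush */
#endif
    /* The pages would be returned to the page allocator here. */
}
```

The single flush in step 2 relies on the assumption stated above that TLBs
do not cache NP entries, so the later NP to RW transition is safe without a
second flush.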

If you all think so, I can indeed take a shot at it. I just don't see what the
problem was with the original solution, which seems less likely to break
anything.

Thanks,

Rick