Thread (54 messages) 54 messages, 9 authors, 2018-12-07

Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages

From: Nadav Amit <hidden>
Date: 2018-12-04 22:48:46
Also in: linux-mm, lkml

On Dec 4, 2018, at 11:48 AM, Andy Lutomirski [off-list ref] wrote:

On Tue, Dec 4, 2018 at 11:45 AM Nadav Amit [off-list ref] wrote:
quoted
quoted
On Dec 4, 2018, at 10:56 AM, Andy Lutomirski [off-list ref] wrote:

On Mon, Dec 3, 2018 at 5:43 PM Nadav Amit [off-list ref] wrote:
quoted
quoted
On Nov 27, 2018, at 4:07 PM, Rick Edgecombe [off-list ref] wrote:

Since vfree will lazily flush the TLB, but not lazily free the underlying pages,
it often leaves stale TLB entries to freed pages that could get re-used. This is
undesirable for cases where the memory being freed has special permissions such
as executable.
So I am trying to finish my patch-set for preventing transient W+X mappings
from taking space, by handling kprobes & ftrace that I missed (thanks again for
pointing it out).

But all of the sudden, I don’t understand why we have the problem that this
(your) patch-set deals with at all. We already change the mappings to make
the memory writable before freeing the memory, so why can’t we make it
non-executable at the same time? Actually, why do we make the module memory,
including its data executable before freeing it???
All the code you're looking at is IMO a very awkward and possibly
incorrect of doing what's actually necessary: putting the direct map
the way it wants to be.

Can't we shove this entirely mess into vunmap?  Have a flag (as part
of vmalloc like in Rick's patch or as a flag passed to a vfree variant
directly) that makes the vunmap code that frees the underlying pages
also reset their permissions?

Right now, we muck with set_memory_rw() and set_memory_nx(), which
both have very awkward (and inconsistent with each other!) semantics
when called on vmalloc memory.  And they have their own flushes, which
is inefficient.  Maybe the right solution is for vunmap to remove the
vmap area PTEs, call into a function like set_memory_rw() that resets
the direct maps to their default permissions *without* flushing, and
then to do a single flush for everything.  Or, even better, to cause
the change_page_attr code to do the flush and also to flush the vmap
area all at once so that very small free operations can flush single
pages instead of flushing globally.
Thanks for the explanation. I read it just after I realized that indeed the
whole purpose of this code is to get cpa_process_alias()
update the corresponding direct mapping.

This thing (pageattr.c) indeed seems over-engineered and very unintuitive.
Right now I have a list of patch-sets that I owe, so I don’t have the time
to deal with it.

But, I still think that disable_ro_nx() should not call set_memory_x().
IIUC, this breaks W+X of the direct-mapping which correspond with the module
memory. Does it ever stop being W+X?? I’ll have another look.
Dunno.  I did once chase down a bug where some memory got freed while
it was still read-only, and the results were hilarious and hard to
debug, since the explosion happened long after the buggy code
finished.
This piece of code causes me pain and misery.

So, it turns out that the direct map is *not* changed if you just change
the NX-bit. See change_page_attr_set_clr():

        /* No alias checking for _NX bit modifications */
        checkalias = (pgprot_val(mask_set) | pgprot_val(mask_clr)) != _PAGE_NX;

How many levels of abstraction are broken in the way? What would happen
if somebody tries to change the NX-bit and some other bit in the PTE?
Luckily, I don’t think someone does… at least for now.

So, again, I think the change I proposed makes sense. nios2 does not have
set_memory_x() and it will not be affected.

[ I can add a comment, although I don’t have know if nios2 has an NX bit,
and I don’t find any code that defines PTEs. Actually where is pte_present()
of nios2 being defined? Whatever. ]
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help