Thread (25 messages), 6 authors, 2022-01-19

Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings

From: Christophe Leroy <hidden>
Date: 2022-01-15 10:17:52
Also in: linux-arm-kernel, linux-doc, linux-mm, lkml


On 29/12/2021 at 12:01, Kefeng Wang wrote:
On 2021/12/29 0:14, Dave Hansen wrote:
On 12/28/21 2:26 AM, Kefeng Wang wrote:
There are some disadvantages about this feature[2], one of the main
concerns is the possible memory fragmentation/waste in some scenarios,
also archs must ensure that any arch specific vmalloc allocations that
require PAGE_SIZE mappings (e.g., module alloc with STRICT_MODULE_RWX)
use the VM_NO_HUGE_VMAP flag to inhibit larger mappings.
That just says that x86 *needs* PAGE_SIZE allocations.  But, what
happens if VM_NO_HUGE_VMAP is not passed (like it was in v1)?  Will the
subsequent permission changes just fragment the 2M mapping?
Yes, without VM_NO_HUGE_VMAP, it could fragment the 2M mapping.

When module alloc is used with STRICT_MODULE_RWX on x86, __change_page_attr()
is called from set_memory_ro/rw/nx, which will split the large page, so there
is no need to make module alloc use HUGE_VMALLOC.
This all sounds very fragile to me.  Every time a new architecture would
get added for huge vmalloc() support, the developer needs to know to go
find that architecture's module_alloc() and add this flag.  The next
guy is going to forget, just like you did.

Considering that this is not a hot path, a weak function would be a nice
choice:

/* vmalloc() flags used for all module allocations. */
unsigned long __weak arch_module_vm_flags(void)
{
    /*
     * Modules use a single, large vmalloc().  Different
     * permissions are applied later and will fragment
     * huge mappings.  Avoid using huge pages for modules.
     */
    return VM_NO_HUGE_VMAP;
For x86 it only fragments, but for arm64, due to apply_to_page_range() in
set_memory_*, it may crash without this flag. Either way, we need this
flag for modules.
I see no reason to have this flag by default.

Only ARM should have it if necessary, with a comment explaining why just 
like powerpc.

And maybe the flag should be there only when STRICT_MODULE_RWX is selected.
}

Stick that in the common module code, next to:
void * __weak module_alloc(unsigned long size)
{
         return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
...

Then, put arch_module_vm_flags() in *all* of the module_alloc()
implementations, including the generic one.  That way (even with a new
architecture) whoever copies-and-pastes their module_alloc()
implementation is likely to get it right.  The next guy who just does a
"select HAVE_ARCH_HUGE_VMALLOC" will hopefully just work.
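
The pattern described above can be sketched in userspace roughly as follows.
This is not kernel code: only arch_module_vm_flags() and VM_NO_HUGE_VMAP come
from the thread, while the flag value, struct mock_area, mock_vmalloc_range()
and mock_module_alloc() are hypothetical stand-ins. It assumes a GCC- or
Clang-compatible compiler for the weak attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative value only; the real definition lives in the kernel headers. */
#define VM_NO_HUGE_VMAP 0x00000400UL

/*
 * Weak default used for all module allocations. An architecture could
 * override it by providing a non-weak definition of the same symbol.
 */
__attribute__((weak)) unsigned long arch_module_vm_flags(void)
{
    return VM_NO_HUGE_VMAP;
}

/* Hypothetical stand-in for a vmalloc area: just records what was asked for. */
struct mock_area {
    size_t size;
    unsigned long vm_flags;
};

/* Hypothetical stand-in for __vmalloc_node_range(). */
static struct mock_area *mock_vmalloc_range(size_t size, unsigned long vm_flags)
{
    struct mock_area *area = malloc(sizeof(*area));

    if (!area)
        return NULL;
    area->size = size;
    area->vm_flags = vm_flags;
    return area;
}

/* Every module_alloc() variant asks the same hook for its vmalloc flags. */
struct mock_area *mock_module_alloc(size_t size)
{
    assert(size > 0);
    return mock_vmalloc_range(size, arch_module_vm_flags());
}
```

With this shape, a copied-and-pasted module_alloc() keeps calling the hook,
so the flag choice stays in one place instead of in each implementation.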
OK, let me check VM_FLUSH_RESET_PERMS and try this approach.

Thanks.
VM_FLUSH_RESET_PERMS could probably be dealt with in the same way.
.
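
The alternative raised earlier in this message, setting the flag only when
STRICT_MODULE_RWX is selected, might look roughly like the sketch below.
MOCK_STRICT_MODULE_RWX is a hypothetical stand-in for the kernel's
CONFIG_STRICT_MODULE_RWX check, module_alloc_vm_flags() is an illustrative
helper that is not in the thread, and the flag value is illustrative.

```c
#include <assert.h>

/* Illustrative value only; the real definition lives in the kernel headers. */
#define VM_NO_HUGE_VMAP 0x00000400UL

/* Hypothetical stand-in for Kconfig: 1 mimics CONFIG_STRICT_MODULE_RWX=y. */
#ifndef MOCK_STRICT_MODULE_RWX
#define MOCK_STRICT_MODULE_RWX 1
#endif

/* Forbid huge mappings only when per-page module permissions are enforced. */
unsigned long arch_module_vm_flags(void)
{
#if MOCK_STRICT_MODULE_RWX
    return VM_NO_HUGE_VMAP;
#else
    return 0;
#endif
}

/* Illustrative helper: combine the caller's flags with the arch hook. */
unsigned long module_alloc_vm_flags(unsigned long caller_flags)
{
    unsigned long flags = caller_flags | arch_module_vm_flags();

    /* With strict RWX enabled, the result must always forbid huge mappings. */
    assert(!MOCK_STRICT_MODULE_RWX || (flags & VM_NO_HUGE_VMAP));
    return flags;
}
```

Whether the flag should be unconditional or tied to STRICT_MODULE_RWX is the
open question in the thread; the sketch only shows what the conditional form
would look like.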