Re: [PATCH bpf 0/4] introduce HAVE_ARCH_HUGE_VMALLOC_FLAG for bpf_prog_pack
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Date: 2022-03-31 16:20:27
Also in: bpf, linux-mm
On Thu, 2022-03-31 at 00:46 +0000, Song Liu wrote:
> On Mar 30, 2022, at 5:04 PM, Edgecombe, Rick P <rick.p.edgecombe@intel.com> wrote:
> > On Wed, 2022-03-30 at 15:56 -0700, Song Liu wrote:
> > > [1] https://lore.kernel.org/lkml/5bd16e2c06a2df357400556c6ae01bb5d3c5c32a.camel@intel.com/
> >
> > The issues I brought up around VM_FLUSH_RESET_PERMS are not fixed in this series. And I think the solution I proposed is kind of wonky with respect to hibernate. So I think maybe hibernate should be fixed to not impose restrictions on the direct map, so the wonkiness is not needed. But then this "fixes" series becomes quite extensive. I wonder, why not just push the patch 1 here, then re-enable this thing when it is all properly fixed up. It looked like your code could handle the allocation not actually getting large pages.
>
> Only shipping patch 1 should eliminate the issues. But that will also reduce the benefit in iTLB efficiency (I don't know by how much yet.)
Yea, it's just a matter of what order/timeline things get done in. This change didn't get enough mm attention ahead of time. Now there are two issues. One where the root cause is not fully clear and one that properly needs a wider fix. Just thinking it could be nice to take some time on it, rather than rush to finish what was already too rushed.
> > Another solution that would keep large pages but still need fixing up later: Just don't use VM_FLUSH_RESET_PERMS for now. Call set_memory_nx() and then set_memory_rw() on the module space address before vfree(). This will clean up everything that's needed with respect to direct map permissions. Have vmalloc warn if it sees VM_FLUSH_RESET_PERMS and huge pages together.
>
> Do you mean we should remove set_vm_flush_reset_perms() from alloc_new_pack() and do set_memory_nx() and set_memory_rw() before we call vfree() in bpf_prog_pack_free()? If this works, I would prefer we go with this way.
I believe this would work functionally.
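To make the ordering concrete, here is a rough userspace sketch of the free path discussed above. It is not the kernel code: set_memory_ro(), set_memory_x(), set_memory_nx(), set_memory_rw() and vfree() are stubs that only mimic the kernel prototypes, and struct pack, PACK_SIZE and the permission tracking are made up for illustration. The stubbed vfree() asserts that the region is already writable and non-executable, which stands in for the state the direct map needs to be in before the memory is handed back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL
#define PACK_SIZE (2UL * 1024 * 1024) /* assumed 2 MiB pack, illustrative only */

/* Hypothetical userspace model of the permissions on one allocation. */
static struct { bool writable; bool executable; } region;
static int frees;

/* Stubs with the shape of the kernel helpers; they only flip the model. */
static int set_memory_ro(unsigned long addr, int numpages)
{ (void)addr; (void)numpages; region.writable = false; return 0; }
static int set_memory_rw(unsigned long addr, int numpages)
{ (void)addr; (void)numpages; region.writable = true; return 0; }
static int set_memory_x(unsigned long addr, int numpages)
{ (void)addr; (void)numpages; region.executable = true; return 0; }
static int set_memory_nx(unsigned long addr, int numpages)
{ (void)addr; (void)numpages; region.executable = false; return 0; }

/* Stub vfree(): insists permissions were reset by the caller beforehand. */
static void vfree(const void *addr)
{
	assert(region.writable && !region.executable);
	free((void *)addr);
	frees++;
}

struct pack { void *ptr; }; /* simplified stand-in for struct bpf_prog_pack */

static struct pack *alloc_new_pack(void)
{
	struct pack *pack = malloc(sizeof(*pack));
	int numpages = (int)(PACK_SIZE / PAGE_SIZE);

	if (!pack)
		return NULL;
	pack->ptr = malloc(PACK_SIZE);
	if (!pack->ptr) {
		free(pack);
		return NULL;
	}
	region.writable = true;
	region.executable = false;
	/* Note: no set_vm_flush_reset_perms() call in this variant. */
	set_memory_ro((unsigned long)pack->ptr, numpages);
	set_memory_x((unsigned long)pack->ptr, numpages);
	return pack;
}

static void pack_free(struct pack *pack)
{
	unsigned long addr = (unsigned long)pack->ptr;
	int numpages = (int)(PACK_SIZE / PAGE_SIZE);

	/* Reset permissions by hand: NX first, then RW, then free. */
	set_memory_nx(addr, numpages);
	set_memory_rw(addr, numpages);
	vfree(pack->ptr);
	free(pack);
}
```

Doing NX before RW means the region is never writable and executable at the same time during teardown.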