Re: [PATCH v6 bpf-next 0/5] bpf_prog_pack followup
From: Song Liu <hidden>
Date: 2022-07-09 01:14:33
Also in:
bpf, linux-mm, lkml
On Jul 8, 2022, at 3:24 PM, Luis Chamberlain [off-list ref] wrote: On Fri, Jul 08, 2022 at 07:58:44PM +0000, Song Liu wrote:quoted
quoted
On Jul 8, 2022, at 8:58 AM, Luis Chamberlain [off-list ref] wrote: On Fri, Jul 08, 2022 at 01:36:25AM +0000, Song Liu wrote:quoted
quoted
On Jul 7, 2022, at 5:53 PM, Luis Chamberlain [off-list ref] wrote: On Thu, Jul 07, 2022 at 11:52:58PM +0000, Song Liu wrote:quoted
quoted
On Jul 7, 2022, at 3:59 PM, Luis Chamberlain [off-list ref] wrote: On Thu, Jul 07, 2022 at 03:35:41PM -0700, Song Liu wrote:quoted
This set is the second half of v4 [1]. Changes v5 => v6: 1. Rebase and extend CC list.Why post a new iteration so soon without completing the discussion we had? It seems like we were at least going somewhere. If it's just to include mm as I requested, sure, that's fine, but this does not provide context as to what we last were talking about.Sorry for sending v6 too soon. The primary reason was to extend the CC list and add it back to patchwork (v5 somehow got archived). Also, I think vmalloc_exec_ work would be a separate project, while this set is the followup work of bpf_prog_pack. Does this make sense? Btw, vmalloc_exec_ work could be a good topic for LPC. It will be much more efficient to discuss this in person.What we need is input from mm / arch folks. What is not done here is what that stuff we're talking about is and so mm folks can't guess. My preference is to address that. I don't think in person discussion is needed if the only folks discussing this topic so far is just you and me.How about we start a thread with mm / arch folks for the vmalloc_exec_* topic? I will summarize previous discussions and include pointers to these discussions. If necessary, we can continue the discussion at LPC.This sounds like a nice thread to use as this is why we are talking about that topic.quoted
OTOH, I guess the outcome of that discussion should not change this set?If the above is done right then actually I think it would show similar considerations for a respective free for module_alloc_huge().quoted
If we have concern about module_alloc_huge(), maybe we can have bpf code call vmalloc directly (until we have vmalloc_exec_)?You'd need to then still open code in a similar way the same things which we are trying to reach consensus on.quoted
quoted
What do you think about this plan?I think we should strive to not be lazy and sloppy, and prevent growth of sloppy code. So long as we do that I think this is all reasoanble.Let me try to understand your concerns here. Say if we want module code to be a temporary home for module_alloc_huge before we move it to mm code. Would you think it is ready to ship if we:Please CC Christoph and linux-modules@vger.kernel.org on future patches and dicussions aroudn this, and all others now CC'd.
Sometimes, vger drops my patch because the CC list is too long. That's the reason I often trim the CC list. I will try to keep folks in this thread CC'ed.
quoted
1) Rename module_alloc_huge as module_alloc_text_huge();module_alloc_text_huge() is too long, but I've suggested names before which are short and generic, and also suggested that if modules are not the only users this needs to go outside of modules and so vmalloc_text_huge() or whatever. To do this right it begs the question why we don't do that for the existing module_alloc(), as the users of this code is well outside of modules now. Last time a similar generic name was used all the special arch stuff was left to be done by the module code still, but still non-modules were still using that allocator. From my perspective the right thing to do is to deal with all the arch stuff as well in the generic handler, and have the module code *and* the other users which use module_alloc() to use that new caller as well.
The key difference between module_alloc() and the new API is that the API will return RO+X memory, and the user need text-poke like API to modify this buffer. Archs that do not support text-poke will not be able to use the new API. Does this sound like a reasonable design?
quoted
2) Add module_free_text_huge();Right, we have special handling for how we free this special code for regular module_alloc() and so similar considerations would be needed here for the huge stuff.quoted
3) Move set_memory_* and fill_ill_insn logic into module_alloc_text_huge() and module_free_text_huge().Yes, that's a bit hairy now, and so a saner and consistent way to do this would be best.
Thanks for these information. I will try to go this direction.
quoted
Are these on the right direction? Did I miss anything important?I've also hinted before that another way to help here is to have draw up a simple lib/test_vmalloc_text.c or something like that which would enable a selftest to ensure correctness of this code on different archs and maybe even let you do performance analysis using perf [0]. You have good reasons to move to the huge allocator and the performance metrics are an abstract load, however perf measurements can also give you real raw data which you can reproduce and enable others to do similar comparisons later. The last thing I'd ask is just ensure you Cc folks who have already been in these discussions. [0] https://lkml.kernel.org/r/Yog+d+oR5TtPp2cs@bombadil.infradead.org
Let me see how we can test it. Thanks, Song