Thread (9 messages) 9 messages, 2 authors, 2022-07-13

Re: [PATCH v6 bpf-next 0/5] bpf_prog_pack followup

From: Song Liu <hidden>
Date: 2022-07-09 01:14:33
Also in: bpf, linux-mm, lkml

On Jul 8, 2022, at 3:24 PM, Luis Chamberlain [off-list ref] wrote:

On Fri, Jul 08, 2022 at 07:58:44PM +0000, Song Liu wrote:
quoted
quoted
On Jul 8, 2022, at 8:58 AM, Luis Chamberlain [off-list ref] wrote:

On Fri, Jul 08, 2022 at 01:36:25AM +0000, Song Liu wrote:
quoted
quoted
On Jul 7, 2022, at 5:53 PM, Luis Chamberlain [off-list ref] wrote:

On Thu, Jul 07, 2022 at 11:52:58PM +0000, Song Liu wrote:
quoted
quoted
On Jul 7, 2022, at 3:59 PM, Luis Chamberlain [off-list ref] wrote:

On Thu, Jul 07, 2022 at 03:35:41PM -0700, Song Liu wrote:
quoted
This set is the second half of v4 [1].

Changes v5 => v6:
1. Rebase and extend CC list.
Why post a new iteration so soon without completing the discussion we
had? It seems like we were at least going somewhere. If it's just
to include mm as I requested, sure, that's fine, but this does not
provide context as to what we last were talking about.
Sorry for sending v6 too soon. The primary reason was to extend the CC
list and add it back to patchwork (v5 somehow got archived). 

Also, I think vmalloc_exec_ work would be a separate project, while this 
set is the followup work of bpf_prog_pack. Does this make sense? 

Btw, vmalloc_exec_ work could be a good topic for LPC. It will be much
more efficient to discuss this in person. 
What we need is input from mm / arch folks. What is not done here is
what that stuff we're talking about is and so mm folks can't guess. My
preference is to address that.

I don't think in person discussion is needed if the only folks
discussing this topic so far is just you and me.
How about we start a thread with mm / arch folks for the vmalloc_exec_*
topic? I will summarize previous discussions and include pointers to 
these discussions. If necessary, we can continue the discussion at LPC.
This sounds like a nice thread to use as this is why we are talking
about that topic.
quoted
OTOH, I guess the outcome of that discussion should not change this set? 
If the above is done right then actually I think it would show similar
considerations for a respective free for module_alloc_huge().
quoted
If we have concern about module_alloc_huge(), maybe we can have bpf code 
call vmalloc directly (until we have vmalloc_exec_)? 
You'd need to then still open code in a similar way the same things
which we are trying to reach consensus on.
quoted
quoted
What do you think about this plan?
I think we should strive to not be lazy and sloppy, and prevent growth
of sloppy code. So long as we do that I think this is all reasoanble.
Let me try to understand your concerns here. Say if we want module code
to be a temporary home for module_alloc_huge before we move it to mm
code. Would you think it is ready to ship if we:
Please CC Christoph and linux-modules@vger.kernel.org on future patches
and dicussions aroudn this, and all others now CC'd.
Sometimes, vger drops my patch because the CC list is too long. That's 
the reason I often trim the CC list. I will try to keep folks in this
thread CC'ed. 
quoted
1) Rename module_alloc_huge as module_alloc_text_huge();
module_alloc_text_huge() is too long, but I've suggested names before
which are short and generic, and also suggested that if modules are
not the only users this needs to go outside of modules and so
vmalloc_text_huge() or whatever.

To do this right it begs the question why we don't do that for the
existing module_alloc(), as the users of this code is well outside of
modules now. Last time a similar generic name was used all the special
arch stuff was left to be done by the module code still, but still
non-modules were still using that allocator. From my perspective the
right thing to do is to deal with all the arch stuff as well in the
generic handler, and have the module code *and* the other users which
use module_alloc() to use that new caller as well.
The key difference between module_alloc() and the new API is that the 
API will return RO+X memory, and the user need text-poke like API to
modify this buffer. Archs that do not support text-poke will not be
able to use the new API. Does this sound like a reasonable design?
quoted
2) Add module_free_text_huge();
Right, we have special handling for how we free this special code for regular
module_alloc() and so similar considerations would be needed here for
the huge stuff.
quoted
3) Move set_memory_* and fill_ill_insn logic into module_alloc_text_huge()
and module_free_text_huge(). 
Yes, that's a bit hairy now, and so a saner and consistent way to do
this would be best.
Thanks for these information. I will try to go this direction. 
quoted
Are these on the right direction? Did I miss anything important?
I've also hinted before that another way to help here is to have draw
up a simple lib/test_vmalloc_text.c or something like that which would
enable a selftest to ensure correctness of this code on different archs
and maybe even let you do performance analysis using perf [0]. You have
good reasons to move to the huge allocator and the performance metrics
are an abstract load, however perf measurements can also give you real
raw data which you can reproduce and enable others to do similar
comparisons later.

The last thing I'd ask is just ensure you Cc folks who have already been in
these discussions.

[0] https://lkml.kernel.org/r/Yog+d+oR5TtPp2cs@bombadil.infradead.org
Let me see how we can test it. 

Thanks,
Song
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help