Thread: 26 messages, 4 authors, 2023-11-28

Re: [RFC PATCH 00/11] mm/mempolicy: Make task->mempolicy externally modifiable via syscall and procfs

From: Michal Hocko <mhocko@suse.com>
Date: 2023-11-28 09:45:08
Also in: linux-arch, linux-doc, linux-fsdevel, linux-mm, lkml

On Mon 27-11-23 11:14:44, Gregory Price wrote:
On Mon, Nov 27, 2023 at 04:29:56PM +0100, Michal Hocko wrote:
Sorry, didn't have much time to do a proper review. Couple of points
here at least.
So... yeah... this is one area I think the community very much needs to
comment on: set/get_mempolicy2, many new mempolicy syscalls, procfs? All
of the above?
I think we should actively avoid using the proc interface. The most
reasonable way would be to add a get_mempolicy2 interface that would allow
extensions and then create a pidfd counterpart to allow acting on a
remote task. The latter would require some changes to make the mempolicy
code less current-oriented.
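
For illustration only: one common way to build an interface "that would
allow extensions" is the size-versioned argument struct used by newer
syscalls such as openat2 and clone3, where the caller passes the struct
size and the kernel zero-fills fields the caller does not know about. The
thread does not say get_mempolicy2 would take this form. The sketch below
is a userspace model of that copy rule; struct mpol_args, its fields, and
mpol_copy_args() are hypothetical names, not part of any posted patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical argument layout; field names are illustrative only. */
struct mpol_args {
	uint16_t mode;       /* present in the imagined first version */
	uint16_t mode_flags; /* present in the imagined first version */
	int32_t  home_node;  /* imagined later extension */
};

/* Size of the imagined first version of the struct. */
#define MPOL_ARGS_SIZE_VER0 offsetof(struct mpol_args, home_node)

/*
 * Userspace model of size-versioned copying:
 *  - a caller with an older (smaller) struct gets the new fields zeroed;
 *  - a caller with a newer (larger) struct is accepted only if the bytes
 *    this side does not understand are all zero, else -E2BIG.
 */
static int mpol_copy_args(struct mpol_args *dst, const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t ksize = sizeof(*dst);
	size_t n = usize < ksize ? usize : ksize;

	if (usize < MPOL_ARGS_SIZE_VER0)
		return -EINVAL;
	for (size_t i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;
	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}
```

Under this rule an old binary keeps working after the struct grows, and a
new binary fails cleanly on an old kernel only when it actually uses a
field that kernel does not know.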
Sounds good, I'll pull my get/set_mempolicy2 RFC on top of this.

Just context: patches 1-6 refactor mempolicy to allow remote task
twiddling (fixing the current-oriented issues), and patch 7 adds the pidfd
interfaces you describe above.


Couple Questions

1) Should we consider simply adding a pidfd arg to set/get_mempolicy2,
   where if (pidfd == 0), then it operates on current, otherwise it
   operates on the target task?  That would mitigate the need for what
   amounts to the exact same interface.
This wouldn't fit into the existing pidfd interfaces I am aware of. We
assume a pidfd to be a real fd, with no special cases.
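
As a sketch of what "real fd, no special cases" can look like from
userspace: 0 is itself a valid descriptor number, so it makes a poor
sentinel, and a caller that wants to target itself through a pidfd-based
call can simply open a pidfd for its own pid. pidfd_open is an existing
Linux syscall (5.3 and later); the helper name self_pidfd() and the
fallback syscall number are assumptions for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434 /* assumed fallback; matches most architectures */
#endif

/*
 * Open a pidfd that refers to the calling process.
 * Returns a real file descriptor (>= 0) or a negative errno value,
 * e.g. -ENOSYS on kernels without pidfd_open.
 */
static int self_pidfd(void)
{
	long fd = syscall(SYS_pidfd_open, getpid(), 0);

	return fd < 0 ? -errno : (int)fd;
}
```

The returned descriptor can then be passed to any pidfd-taking interface
exactly as a remote task's pidfd would be, and closed with close().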
2) Should we combine all the existing operations into set_mempolicy2 and
   add an operation arg.

   set_mempolicy2(pidfd, arg_struct, len)

   struct {
     int pidfd; /* optional */
     int operation; /* describe which op_args to use */
     union {
       struct {
       } set_mempolicy;
       struct {
       } set_vma_home_node;
       struct {
       } mbind;
       ...
     } op_args;
   } args;

   capturing:
     sys_set_mempolicy
     sys_set_mempolicy_home_node
     sys_mbind

   or should we just make a separate interface for mbind/home_node to
   limit complexity of the single syscall?
My preference would be to go with specific syscalls. Multiplexing
syscalls have turned out to be much more complex and less flexible over
time. Just have a look at futex.
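
To make the futex reference concrete, here is a small sketch of how the
meaning of the same argument slots depends on the operation code. It uses
the real futex syscall through syscall(2); the helper names and the
futex_word variable are made up for the example, and it assumes an
architecture that defines SYS_futex (for instance x86_64 or arm64).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t futex_word = 1;

/*
 * FUTEX_WAKE: the third argument is "how many waiters to wake";
 * the timeout, uaddr2 and val3 slots are unused.
 * Returns the number of waiters woken (none are waiting here).
 */
static long futex_wake_one(void)
{
	return syscall(SYS_futex, &futex_word, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/*
 * FUTEX_WAIT: the third argument is now "the value I expect to find",
 * and the fourth slot is a timeout pointer. Because futex_word is 1 and
 * we claim to expect 0, the kernel refuses to sleep and reports EAGAIN.
 * Returns 0 or a negative errno value.
 */
static long futex_wait_mismatch(void)
{
	long r = syscall(SYS_futex, &futex_word, FUTEX_WAIT, 0, NULL, NULL, 0);

	return r < 0 ? -errno : r;
}
```

Both helpers go through one entry point with six loosely typed slots,
whereas dedicated syscalls would give each operation its own signature.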
-- 
Michal Hocko
SUSE Labs