Thread (13 messages) 13 messages, 3 authors, 2021-06-01

Re: [v3 PATCH 2/3] mm/mempolicy: don't handle MPOL_LOCAL like a fake MPOL_PREFERRED policy

From: Feng Tang <hidden>
Date: 2021-06-01 11:29:27
Also in: lkml

On Tue, Jun 01, 2021 at 10:44:39AM +0200, Michal Hocko wrote:
On Mon 31-05-21 22:05:55, Feng Tang wrote:
quoted
MPOL_LOCAL policy has been setup as a real policy, but it is still
handled like a faked POL_PREFERRED policy with one internal
MPOL_F_LOCAL flag bit set, and there are many places having to
judge the real 'prefer' or the 'local' policy, which are quite
confusing.

In current code, there are 4 cases that MPOL_LOCAL are used:
1. user specifies 'local' policy
2. user specifies 'prefer' policy, but with empty nodemask
3. system 'default' policy is used
4. 'prefer' policy + valid 'preferred' node with MPOL_F_STATIC_NODES
   flag set, and when it is 'rebind' to a nodemask which doesn't
   contains the 'preferred' node, it will perform as 'local' policy

So make 'local' a real policy instead of a fake 'prefer' one, and
kill MPOL_F_LOCAL bit, which can greatly reduce the confusion for
code reading.

For case 4, the logic of mpol_rebind_preferred() is confusing, as
Michal Hocko pointed out:

 "
 I do believe that rebinding preferred policy is just bogus and
 it should be dropped altogether on the ground that a preference
 is a mere hint from userspace where to start the allocation.
 Unless I am missing something cpusets will be always authoritative
 for the final placement. The preferred node just acts as a starting
 point and it should be really preserved when cpusets changes.
 Otherwise we have a very subtle behavior corner cases.
 "
So dump all the tricky transformation between 'prefer' and 'local',
and just record the new nodemask of rebinding.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Feng Tang <redacted>
I like this very much! It simplifies a tricky code and also a very
dubious behavior. I would like to hear from others whether there might
be some userspace depending on this obscure behavior though. One never
knows...

Some more notes/questions below

[...]
quoted
@@ -239,25 +240,19 @@ static int mpol_set_nodemask(struct mempolicy *pol,
 		  cpuset_current_mems_allowed, node_states[N_MEMORY]);
 
 	VM_BUG_ON(!nodes);
-	if (pol->mode == MPOL_PREFERRED && nodes_empty(*nodes))
-		nodes = NULL;	/* explicit local allocation */
-	else {
-		if (pol->flags & MPOL_F_RELATIVE_NODES)
-			mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
-		else
-			nodes_and(nsc->mask2, *nodes, nsc->mask1);
 
-		if (mpol_store_user_nodemask(pol))
-			pol->w.user_nodemask = *nodes;
-		else
-			pol->w.cpuset_mems_allowed =
-						cpuset_current_mems_allowed;
-	}
+	if (pol->flags & MPOL_F_RELATIVE_NODES)
+		mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
+	else
+		nodes_and(nsc->mask2, *nodes, nsc->mask1);
Maybe I've just got lost here but why don't you need to check for the
local policy anymore? mpol_new will take care of the MPOL_PREFERRED &&
nodes_empty special but why do we want/need all this for a local policy
at all?
 
You are right that 'local' policy doesn't need this, it should just
return in the early port of this function, like 'default' policy, which
can remove the useless nop mpol_new_local().
quoted
 
-	if (nodes)
-		ret = mpol_ops[pol->mode].create(pol, &nsc->mask2);
+	if (mpol_store_user_nodemask(pol))
+		pol->w.user_nodemask = *nodes;
 	else
-		ret = mpol_ops[pol->mode].create(pol, NULL);
+		pol->w.cpuset_mems_allowed =
+					cpuset_current_mems_allowed;
please use a single line. This is just harder to read. You will cross
the line limit but readability should be preferred here.
 
Will change.

Thanks,
Feng
[...]

I haven't spotted anything else.
-- 
Michal Hocko
SUSE Labs
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help