Re: [External] Re: [PATCH v2] mm: add new syscall pidfd_set_mempolicy().

From: Zhongkun He <hidden>
Date: 2022-11-16 11:44:18
Also in: linux-doc, linux-mm, lkml

Hi Michal, I've done the performance testing; please take a look.

> Yes this is all understood but the level of the overhead is not really
> clear. So the question is whether this will induce a visible overhead.
> Because from the maintainability point of view it is much less costly to
> have a clear life time model. Right now we have a mix of reference
> counting and per-task requirements which is rather subtle and easy to
> get wrong. In an ideal world we would have get_vma_policy always
> returning a reference counted policy or NULL. If we really need to
> optimize for cache line bouncing we can go with per cpu reference
> counters (something that was not available at the time the mempolicy
> code has been introduced).
>
> So I am not saying that the task_work based solution is not possible I
> just think that this looks like a good opportunity to get from the
> existing subtle model.
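
As an aside on the per-CPU reference counters mentioned above, here is a
minimal userspace sketch of the idea. It is only an illustration: the
names sharded_ref_*, NSLOTS and CACHELINE are hypothetical, the slot
index stands in for a CPU number, and it does not reproduce the kernel's
percpu_ref API. Each get/put touches only the caller's own cache line,
and the true count is the sum over all slots.

```c
#include <assert.h>
#include <stdatomic.h>

#define NSLOTS    4   /* hypothetical: one slot per CPU */
#define CACHELINE 64  /* assumed cache line size in bytes */

/* One counter per slot, each on its own cache line to avoid bouncing. */
struct slot {
	_Alignas(CACHELINE) atomic_long count;
};

struct sharded_ref {
	struct slot slots[NSLOTS];
};

static void sharded_ref_init(struct sharded_ref *r)
{
	for (int i = 0; i < NSLOTS; i++)
		atomic_init(&r->slots[i].count, 0);
}

/* Take a reference: only the caller's slot is written. */
static void sharded_ref_get(struct sharded_ref *r, unsigned int cpu)
{
	atomic_fetch_add_explicit(&r->slots[cpu % NSLOTS].count, 1,
				  memory_order_relaxed);
}

/*
 * Drop a reference. The put may run on a different slot than the get,
 * so an individual slot can go negative; only the sum is meaningful.
 */
static void sharded_ref_put(struct sharded_ref *r, unsigned int cpu)
{
	atomic_fetch_sub_explicit(&r->slots[cpu % NSLOTS].count, 1,
				  memory_order_relaxed);
}

/* Sum all slots. Only exact while no get/put is running concurrently. */
static long sharded_ref_total(struct sharded_ref *r)
{
	long sum = 0;

	for (int i = 0; i < NSLOTS; i++)
		sum += atomic_load(&r->slots[i].count);
	assert(sum >= 0);
	return sum;
}
```

The trade-off is that finding out whether the count has reached zero
means walking every slot, so this only pays off when get/put is far
more frequent than teardown.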

Test tools:
numactl -m 0-3 ./run-mmtests.sh -n -c configs/config-workload-aim9-pagealloc test_name

Modification:
get_vma_policy() and get_task_policy() always return a reference
counted policy, except for the static policies (default_policy and
preferred_node_policy[nid]).

All vma manipulation is protected by a down_read, so mpol_get() can be
called directly to take a refcount on the mpol. There is no such lock
in the task->mempolicy context, so task->mempolicy has to be protected
by task_lock.

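struct mempolicy *get_task_policy(struct task_struct *p)
{
	struct mempolicy *pol;
	int node;

	/* Unlocked check: skip task_lock() when no task policy is set. */
	if (p->mempolicy) {
		task_lock(p);
		/* Re-read under the lock; the policy may have changed. */
		pol = p->mempolicy;
		/* mpol_get() does nothing for a NULL policy. */
		mpol_get(pol);
		task_unlock(p);
		if (pol)
			return pol;
	}
	.....
}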
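
To make the lifetime rule above easier to reason about, here is a small
userspace model of the same pattern. It is only a sketch: struct policy,
struct task and the policy_*/task_policy_* helpers are hypothetical
stand-ins, and a pthread mutex plays the role of task_lock(). The point
is that the reference is taken while the lock is held, so a concurrent
replacement cannot drop the last reference between the pointer load and
the increment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's mempolicy or task_struct. */
struct policy {
	atomic_int refcnt;
};

struct task {
	pthread_mutex_t lock;   /* plays the role of task_lock() */
	struct policy *policy;  /* plays the role of task->mempolicy */
};

/* NULL-safe reference grab, modeled on mpol_get(). */
static void policy_get(struct policy *pol)
{
	if (pol)
		atomic_fetch_add(&pol->refcnt, 1);
}

/* Drop a reference; returns 1 if this was the last one. */
static int policy_put(struct policy *pol)
{
	int old;

	if (!pol)
		return 0;
	old = atomic_fetch_sub(&pol->refcnt, 1);
	assert(old > 0);
	return old == 1;
}

/* Reader side: load the pointer and take the reference under the lock. */
static struct policy *task_policy_get(struct task *t)
{
	struct policy *pol;

	pthread_mutex_lock(&t->lock);
	pol = t->policy;
	policy_get(pol);
	pthread_mutex_unlock(&t->lock);
	return pol;
}

/*
 * Writer side: install a new policy under the same lock and hand the
 * old one back, so the caller can drop the task's reference to it.
 */
static struct policy *task_policy_swap(struct task *t, struct policy *newpol)
{
	struct policy *old;

	pthread_mutex_lock(&t->lock);
	old = t->policy;
	t->policy = newpol;
	pthread_mutex_unlock(&t->lock);
	return old;
}
```

In this model a reader that got its reference before the swap keeps the
old policy alive until its own policy_put(), which is the property the
modified get_task_policy() relies on.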

Test Case 1:
Description:
	Test directly, with no other user processes.
Result:
	Performance degrades by about 1% to 3%.
For more information, please see the attachment: mpol.txt

aim9 (first column: baseline, second column: with the modification)

Hmean     page_test   484561.68 (   0.00%)   471039.34 *  -2.79%*
Hmean     brk_test   1400702.48 (   0.00%)  1388949.10 *  -0.84%*
Hmean     exec_test     2339.45 (   0.00%)     2278.41 *  -2.61%*
Hmean     fork_test     6500.02 (   0.00%)     6500.17 *   0.00%*



Test Case 2:
Description:
	Add one user process, top.
Result:
	Performance degrades by about 2.1% on page_test.
For more information, please see the attachment: mpol_top.txt

Hmean     page_test   477916.47 (   0.00%)   467829.01 *  -2.11%*
Hmean     brk_test   1351439.76 (   0.00%)  1373663.90 *   1.64%*
Hmean     exec_test     2312.24 (   0.00%)     2296.06 *  -0.70%*
Hmean     fork_test     6483.46 (   0.00%)     6472.06 *  -0.18%*


Test Case 3:
Description:
	Add a daemon that reads /proc/$test_pid/status in a loop, which
acquires task_lock:
	while :; do cat /proc/$(pidof singleuser)/status; done

Result:
	The page_test baseline drops from 484561 (case 1) to 438591
(about 10%) once the daemon is added, but the degradation caused by the
modification in case 3 is about 3.3% on page_test. For more information,
please see the attachment: mpol_status.txt

Hmean     page_test   438591.97 (   0.00%)   424251.22 *  -3.27%*
Hmean     brk_test   1268906.57 (   0.00%)  1278100.12 *   0.72%*
Hmean     exec_test     2301.19 (   0.00%)     2192.71 *  -4.71%*
Hmean     fork_test     6453.24 (   0.00%)     6090.48 *  -5.62%*
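
For anyone who wants to re-check the percentages, the relative change
can be reproduced with a small helper. This is just a sketch;
percent_change() is a hypothetical name, not part of mmtests.

```c
#include <assert.h>

/* Relative change of value against baseline, in percent. */
static double percent_change(double baseline, double value)
{
	assert(baseline != 0.0);
	return (value - baseline) / baseline * 100.0;
}
```

With the page_test numbers above, percent_change(484561.68, 438591.97)
gives the cost of adding the daemon to the baseline (roughly -9.5%), and
percent_change(438591.97, 424251.22) gives the cost of the modification
with the daemon running (roughly -3.27%).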


Thanks,
Zhongkun.

Attachments

mpol.txt [text/plain] 14076 bytes
mpol_status.txt [text/plain] 7803 bytes
mpol_top.txt [text/plain] 7731 bytes
