Re: [External] Re: [PATCH v2] mm: add new syscall pidfd_set_mempolicy().
From: Zhongkun He <hidden>
Date: 2022-11-16 11:44:18
Also in:
linux-doc, linux-mm, lkml
Hi Michal, I've done the performance testing, please check it out.
> Yes this is all understood but the level of the overhead is not really
> clear. So the question is whether this will induce a visible overhead.
> Because from the maintainability point of view it is much less costly
> to have a clear life time model. Right now we have a mix of reference
> counting and per-task requirements which is rather subtle and easy to
> get wrong. In an ideal world we would have get_vma_policy always
> returning a reference counted policy or NULL. If we really need to
> optimize for cache line bouncing we can go with per cpu reference
> counters (something that was not available at the time the mempolicy
> code has been introduced). So I am not saying that the task_work based
> solution is not possible I just think that this looks like a good
> opportunity to get from the existing subtle model.
Test tools:
numactl -m 0-3 ./run-mmtests.sh -n -c configs/config-workload-aim9-pagealloc test_name
Modification:
get_vma_policy() and get_task_policy() always return a reference
counted policy, except for the static policies (default_policy and
preferred_node_policy[nid]).
All vma manipulation is protected by a down_read, so mpol_get()
can be called directly to take a refcount on the mpol. There is no
such lock in the task->mempolicy context, so task->mempolicy has to
be protected by task_lock.
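struct mempolicy *get_task_policy(struct task_struct *p)
{
	struct mempolicy *pol;
	int node;

	if (p->mempolicy) {
		/* task_lock keeps p->mempolicy stable while the refcount is taken */
		task_lock(p);
		pol = p->mempolicy;
		mpol_get(pol);
		task_unlock(p);
		/* pol can be NULL if the policy was cleared after the unlocked check */
		if (pol)
			return pol;
	}
	.....
}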
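To make the lifetime rule easier to discuss, here is a minimal user-space sketch of the same pattern. It is only a model, not kernel code: struct policy, struct task, task_policy_get(), policy_put() and task_policy_replace() are hypothetical names, and a pthread mutex stands in for task_lock. The point it shows is that a reader takes its reference while the lock pins the pointer, so a concurrent replace cannot free the old policy until that reader drops it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct policy {
	atomic_int refcnt;
};

struct task {
	pthread_mutex_t lock;	/* plays the role of task_lock */
	struct policy *policy;	/* plays the role of task->mempolicy */
};

/* Reader: take a reference while the lock pins the pointer. */
static struct policy *task_policy_get(struct task *t)
{
	struct policy *pol;

	pthread_mutex_lock(&t->lock);
	pol = t->policy;
	if (pol)
		atomic_fetch_add(&pol->refcnt, 1);
	pthread_mutex_unlock(&t->lock);
	return pol;	/* NULL when no policy is set */
}

/* Drop a reference; returns 1 if the caller dropped the last one. */
static int policy_put(struct policy *pol)
{
	int old = atomic_fetch_sub(&pol->refcnt, 1);

	assert(old > 0);
	return old == 1;
}

/* Writer: swap the pointer under the lock and hand back the old policy. */
static struct policy *task_policy_replace(struct task *t, struct policy *new_pol)
{
	struct policy *old;

	pthread_mutex_lock(&t->lock);
	old = t->policy;
	t->policy = new_pol;
	pthread_mutex_unlock(&t->lock);
	return old;	/* caller drops the task's reference with policy_put() */
}
```

In this model a reader that obtained the old policy before a replace still holds a valid object; the policy only becomes freeable when the last policy_put() returns 1.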
Test Case1:
Description:
Test directly, no other user processes.
Result:
Performance degrades by up to about 3%.
For more information, please see the attachment: mpol.txt
aim9
Hmean     page_test      484561.68 (   0.00%)      471039.34 *  -2.79%*
Hmean     brk_test      1400702.48 (   0.00%)     1388949.10 *  -0.84%*
Hmean     exec_test        2339.45 (   0.00%)        2278.41 *  -2.61%*
Hmean     fork_test        6500.02 (   0.00%)        6500.17 *   0.00%*
Test Case2:
Description:
Added a user process, top.
Result:
page_test degrades by about 2.1%; the other tests change by less than 2%.
For more information, please see the attachment: mpol_top.txt
Hmean     page_test      477916.47 (   0.00%)      467829.01 *  -2.11%*
Hmean     brk_test      1351439.76 (   0.00%)     1373663.90 *   1.64%*
Hmean     exec_test        2312.24 (   0.00%)        2296.06 *  -0.70%*
Hmean     fork_test        6483.46 (   0.00%)        6472.06 *  -0.18%*
Test Case3:
Description:
Add a daemon that repeatedly reads /proc/$test_pid/status, which
acquires task_lock:
    while :; do cat /proc/$(pidof singleuser)/status; done
Result:
With the daemon running, the page_test baseline drops from 484561
(case 1) to 438591 (about 10%), but the additional degradation from
the modification in case 3 is about 3.3% for page_test and at most
5.6% (fork_test). For more information, please see the
attachment: mpol_status.txt
Hmean     page_test      438591.97 (   0.00%)      424251.22 *  -3.27%*
Hmean     brk_test      1268906.57 (   0.00%)     1278100.12 *   0.72%*
Hmean     exec_test        2301.19 (   0.00%)        2192.71 *  -4.71%*
Hmean     fork_test        6453.24 (   0.00%)        6090.48 *  -5.62%*
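
For anyone who wants to re-check the percentages against the raw Hmean values, the calculation is just the relative change against the baseline column. A small sketch follows; pct_change() is a hypothetical helper written for this mail, not part of mmtests.

```c
#include <assert.h>

/* Relative change of `value` against `baseline`, in percent. */
static double pct_change(double baseline, double value)
{
	assert(baseline != 0.0);
	return (value - baseline) / baseline * 100.0;
}
```

For example, pct_change(484561.68, 471039.34) gives about -2.79 (case 1 page_test), pct_change(438591.97, 424251.22) gives about -3.27 (case 3 page_test), and pct_change(484561.68, 438591.97) gives about -9.49, which is the baseline drop between case 1 and case 3.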
Thanks,
Zhongkun.

Attachments:
- mpol.txt [text/plain] 14076 bytes
- mpol_status.txt [text/plain] 7803 bytes
- mpol_top.txt [text/plain] 7731 bytes