Re: [External] Re: [PATCH v2] mm: add new syscall pidfd_set_mempolicy().
From: Zhongkun He <hidden>
Date: 2022-11-15 07:39:13
Also in: linux-doc, linux-mm, lkml
>>> We shouldn't really rely on mmap_sem for this IMO.
>>
>> Yes, we should rely on mmap_sem for vma->vm_policy, but not for the
>> process context policy (task->mempolicy).
>
> But the caller has no way to know which kind of policy is returned, so
> the locking cannot be conditional on the policy type.
Yes. vma->vm_policy is protected by mmap_sem, which is reliable if we want to add a new API (pidfd_mbind()) to change the vma->vm_policy of the process specified by a pidfd. But that does not hold for pidfd_set_mempolicy(), because task->mempolicy is protected by alloc_lock.
> Yes, this is all understood, but the level of the overhead is not
> really clear. So the question is whether this will induce a visible
> overhead.
OK, I will try it.
> Because from the maintainability point of view it is much less costly
> to have a clear lifetime model. Right now we have a mix of reference
> counting and per-task requirements, which is rather subtle and easy to
> get wrong. In an ideal world we would have get_vma_policy always
> returning a reference-counted policy or NULL. If we really need to
> optimize for cache line bouncing we can go with per-CPU reference
> counters (something that was not available at the time the mempolicy
> code was introduced). So I am not saying that the task_work-based
> solution is not possible; I just think that this looks like a good
> opportunity to get away from the existing subtle model.
OK, I got it. Thanks for your reply and suggestions.

Zhongkun
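
To make sure I understand the suggested lifetime model ("always return a reference-counted policy or NULL"), here is a rough userspace sketch. It is not kernel code and not a proposed patch. All names (struct policy, struct task, task_policy_get(), task_policy_set(), policies_freed) are hypothetical stand-ins. A C11 atomic_flag spinlock models alloc_lock, and a plain atomic counter stands in for the refcount. Per-CPU reference counters and the vma (mmap_sem) side are not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mempolicy. */
struct policy {
	atomic_int refcnt;
	int mode;
};

/* Hypothetical stand-in for task_struct; alloc_lock is modelled as a spinlock. */
struct task {
	atomic_flag alloc_lock;
	struct policy *mempolicy;
};

/* Counts freed policies so the lifetime can be checked from outside. */
atomic_int policies_freed;

static void task_lock(struct task *t)
{
	while (atomic_flag_test_and_set_explicit(&t->alloc_lock,
						 memory_order_acquire))
		; /* spin until the lock is free */
}

static void task_unlock(struct task *t)
{
	atomic_flag_clear_explicit(&t->alloc_lock, memory_order_release);
}

void task_init(struct task *t)
{
	atomic_flag_clear(&t->alloc_lock);
	t->mempolicy = NULL;
}

/* Allocate a policy holding one reference, owned by the caller. */
struct policy *policy_new(int mode)
{
	struct policy *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	atomic_init(&p->refcnt, 1);
	p->mode = mode;
	return p;
}

/* Drop one reference; the last put frees the policy. */
void policy_put(struct policy *p)
{
	if (!p)
		return;
	int old = atomic_fetch_sub(&p->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		atomic_fetch_add(&policies_freed, 1);
		free(p);
	}
}

/* Always returns a counted reference or NULL, never a borrowed pointer. */
struct policy *task_policy_get(struct task *t)
{
	struct policy *p;

	task_lock(t);
	p = t->mempolicy;
	if (p)
		atomic_fetch_add(&p->refcnt, 1);
	task_unlock(t);
	return p;
}

/* Install pol (taking over the caller's reference) and drop the old one. */
void task_policy_set(struct task *t, struct policy *pol)
{
	struct policy *old;

	task_lock(t);
	old = t->mempolicy;
	t->mempolicy = pol;
	task_unlock(t);
	policy_put(old);
}
```

With this shape a remote setter can replace the policy at any time, because a reader that already called task_policy_get() keeps its own reference until it calls policy_put(). The cost is one atomic increment and decrement per lookup, which is the overhead I will try to measure.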