Thread (8 messages), 4 authors, 2017-11-15

Re: [PATCH] memcg: hugetlbfs basic usage accounting

From: Roman Gushchin <hidden>
Date: 2017-11-15 11:19:02
Also in: linux-mm, lkml

On Wed, Nov 15, 2017 at 09:35:04AM +0100, Michal Hocko wrote:
On Tue 14-11-17 17:24:29, Roman Gushchin wrote:
This patch implements basic accounting of memory consumption
by hugetlbfs pages for cgroup v2 memory controller.

Cgroup v2 memory controller lacks any visibility into the
hugetlbfs memory consumption. Cgroup v1 implemented a separate
hugetlbfs controller, which provided such stats, and also
provided some control abilities. Although porting of the
hugetlbfs controller to cgroup v2 is arguably a good idea and
is outside the scope of this patch, it's very useful to have
basic stats provided by memory.stat.
Hi, Michal!
Separate hugetlb cgroup controller was really a deliberate decision.
We didn't want to mix hugetlb with the reclaimable memory. There is no
reasonable way to enforce memcg limits if hugetlb pages are involved.

AFAICS your patch shouldn't break the hugetlb controller because that
one (ab)uses page[2].private to store the hstate for the accounting.
You also do not really charge those hugetlb pages so the memcg
accounting will work unchanged.
Yes, you are right.
So my primary question is, why don't you simply allow hugetlb controller
rather than tweak stats for memcg? Is there any fundamental reason why
hugetlb controller is not v2 compatible?
I really don't know if the hugetlb controller has enough users to deserve
full support in v2 interface: adding knobs like memory.hugetlb.current,
memory.hugetlb.min, memory.hugetlb.high, memory.hugetlb.max, etc.

I'd be rather conservative here and avoid adding a lot to the interface
without clear demand. Also, hugetlb pages are really special, and it's
at least not obvious how, say, memory.high should work for it.

At the same time we don't really have any accounting of hugetlb page
usage (except system-wide stats in sysfs). And providing such stats
is really useful.
In my particular case, I have some number of pre-allocated hugepages,
and I have several containerized workloads, which are potentially
using them to get performance bonuses. Having these stats makes it
possible to attribute the memory held in hugetlb pages to one of the workloads.
It feels really strange to keep stats of something the controller
doesn't really control. I can imagine confused users claiming that
numbers just do not add up...
This is why I do not add this number to memory.current. At the same
time numbers in memory.stat are not intended to be summed (we have
event counters there, dirty pages counter, etc), so I don't see a problem.

Thanks!
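
The per-workload attribution described above could be done from userspace
roughly as follows. This is only a sketch under stated assumptions: the
stat key name "hugetlb", the workload names, and the sample values are
hypothetical and not taken from the patch. It relies only on memory.stat
being flat "<key> <value>" lines, and it reads a single entry per cgroup
rather than summing entries, since byte counters and event counters are mixed.

```python
def parse_memory_stat(text):
    """Parse memory.stat content ("<key> <value>" per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        stats[key] = int(value)
    return stats


def hugetlb_by_workload(stat_texts, key="hugetlb"):
    """Map workload name -> value of `key` in its memory.stat (0 if absent).

    `key` defaults to "hugetlb", which is an assumed name for illustration.
    """
    return {
        name: parse_memory_stat(text).get(key, 0)
        for name, text in stat_texts.items()
    }


# Made-up sample contents standing in for each cgroup's memory.stat file.
SAMPLE = {
    "workload-a": "anon 1048576\nfile 4194304\nhugetlb 2097152\npgfault 1200\n",
    "workload-b": "anon 524288\nfile 0\npgfault 300\n",
}

if __name__ == "__main__":
    for name, usage in hugetlb_by_workload(SAMPLE).items():
        print(f"{name}: {usage} bytes held in hugetlb pages")
```

On a real system the strings in SAMPLE would be replaced by the contents of
each cgroup's memory.stat file.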
