Re: [RFC][PATCH 0/3] page cgroup diet
From: Konstantin Khlebnikov <hidden>
Date: 2012-03-21 06:13:40
Also in:
linux-mm
KAMEZAWA Hiroyuki wrote:
> (2012/03/20 4:59), Konstantin Khlebnikov wrote:
>> KAMEZAWA Hiroyuki wrote:
>>> This is just an RFC...test is not enough yet. I know it's merge window..this post is just for sharing idea.
>>>
>>> This patch merges pc->flags and pc->mem_cgroup into a word. Then, memcg's overhead will be 8bytes per page(4096bytes?).
>>>
>>> Because this patch will affect all memory cgroup developers, I'd like to show patches before MM Summit. I think we can agree the direction to reduce size of page_cgroup..and finally integrate into 'struct page' (and remove cgroup_disable= boot option...)
>>>
>>> Patch 1/3 - introduce pc_to_mem_cgroup and hide pc->mem_cgroup
>>> Patch 2/3 - remove pc->mem_cgroup
>>> Patch 3/3 - remove memory barriers.
>>>
>>> I'm now wondering when this change should be merged....
>>
>> This is cool, but maybe we should skip this temporary step and merge all this stuff into page->flags.
>
> Why we should skip and delay reduction of size of page_cgroup which is considered as very big problem ?
I think it would be better to solve the problem completely and kill page_cgroup in one step.
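For reference, the RFC's idea of merging pc->flags and pc->mem_cgroup into one word can be pictured with a minimal userspace sketch like the one below. It is not the code from the patches: the struct layouts, the 3-bit flag width and every name ending in _sketch are assumptions made for illustration, and it relies on the mem_cgroup object being at least 8-byte aligned so that the low pointer bits are free.

```c
#include <stdint.h>

/* Hypothetical stand-in; aligned so the low 3 address bits are always zero. */
struct mem_cgroup {
    _Alignas(8) int id;
};

/* Assumed width of the flag field kept in the low bits of the word. */
#define PC_FLAG_BITS 3
#define PC_FLAG_MASK (((uintptr_t)1 << PC_FLAG_BITS) - 1)

/* One word per page: pointer in the high bits, flags in the low bits. */
struct page_cgroup_sketch {
    uintptr_t word;
};

/* Recover the pointer by masking off the flag bits. */
static inline struct mem_cgroup *
pc_to_mem_cgroup_sketch(const struct page_cgroup_sketch *pc)
{
    return (struct mem_cgroup *)(pc->word & ~PC_FLAG_MASK);
}

/* Recover the flags by masking off the pointer bits. */
static inline unsigned long
pc_flags_sketch(const struct page_cgroup_sketch *pc)
{
    return (unsigned long)(pc->word & PC_FLAG_MASK);
}

/* Replace the pointer while keeping the current flags (not atomic). */
static inline void
pc_set_mem_cgroup_sketch(struct page_cgroup_sketch *pc, struct mem_cgroup *memcg)
{
    pc->word = (uintptr_t)memcg | (pc->word & PC_FLAG_MASK);
}

/* Set one flag bit; bit must be below PC_FLAG_BITS (not atomic). */
static inline void
pc_set_flag_sketch(struct page_cgroup_sketch *pc, unsigned int bit)
{
    pc->word |= (uintptr_t)1 << bit;
}
```

The sketch ignores locking and memory ordering entirely; it only shows why a single word is enough when the pointer's alignment leaves spare low bits.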
>> I think we can replace zone-id and node-id in page->flags with cumulative dynamically allocated lruvec-id, so there will be enough space for hundred cgroups even on 32-bit systems.
>
> Where section-id is ? IIUC, now, page->section->zone/node is calculated if CONFIG_SPARSEMEM.
Yeah, sections are the biggest problem there. I hope we can unravel this knot. In the worst case we can extend page->flags up to 64 bits.
> BTW, I doubt that we can modify page->flags dynamically with multi-bit operations...using cmpxchg per each page when it's charged/uncharged/other ?

We can do

  atomic_xor(&page->flags, new-lruvec-id ^ old-lruvec-id)

or

  atomic_add(&page->flags, new-lruvec-id - old-lruvec-id)

They should work faster than cmpxchg.
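To make the xor/add idea concrete, here is a minimal userspace sketch using C11 atomics rather than kernel primitives. The field position (LRUVEC_SHIFT, LRUVEC_BITS) and the helper names are assumptions for illustration only. It also assumes the caller knows the current id and that nobody else changes the id field concurrently (for example because it is serialized by a lock); other bits in the word may still be modified atomically by other CPUs.

```c
#include <stdatomic.h>

/* Assumed placement of the lruvec id inside the flags word (fits in 32 bits). */
#define LRUVEC_SHIFT 24UL
#define LRUVEC_BITS  8UL
#define LRUVEC_MASK  (((1UL << LRUVEC_BITS) - 1) << LRUVEC_SHIFT)

/* Extract the id field from a snapshot of the flags word. */
static inline unsigned long flags_lruvec_id(unsigned long flags)
{
    return (flags & LRUVEC_MASK) >> LRUVEC_SHIFT;
}

/*
 * Switch the id with one atomic xor: old ^ (old ^ new) == new, and bits
 * outside the field are xored with zero, so they are left untouched.
 */
static inline void flags_switch_lruvec_xor(atomic_ulong *flags,
                                           unsigned long old_id,
                                           unsigned long new_id)
{
    atomic_fetch_xor(flags, (old_id ^ new_id) << LRUVEC_SHIFT);
}

/*
 * Same switch with one atomic add. Unsigned arithmetic wraps modulo the
 * word size, so this is also correct when new_id < old_id.
 */
static inline void flags_switch_lruvec_add(atomic_ulong *flags,
                                           unsigned long old_id,
                                           unsigned long new_id)
{
    atomic_fetch_add(flags, (new_id - old_id) << LRUVEC_SHIFT);
}
```

Both variants are a single read-modify-write with no retry loop, which is the advantage over cmpxchg; the price is that they are only correct if old_id really is the value currently stored in the field.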
>> After lru_lock splitting page to lruvec translation will be much more frequently used than page to zone, so page->zone and page->node translations can be implemented as page->lruvec->zone and page->lruvec->node.
>
> And need to take rcu_read_lock() around page_zone() ?
Hmm, it depends. For kernel pages there will be a pointer to the root lruvec, so no protection is required. If we hold lru_lock we don't need rcu_read_lock() either.
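The page->lruvec->zone and page->lruvec->node translation discussed here could look roughly like the userspace sketch below. Everything in it is hypothetical: the table of lruvecs indexed by an id stored in page->flags, the field position, and all the _sketch names. It deliberately leaves out the lifetime question raised above (rcu_read_lock() or lru_lock around the lookup).

```c
#include <stddef.h>

/* Assumed placement of the lruvec id inside page->flags. */
#define LRUVEC_SHIFT 24UL
#define LRUVEC_BITS  8UL
#define LRUVEC_MASK  (((1UL << LRUVEC_BITS) - 1) << LRUVEC_SHIFT)

/* Hypothetical stand-ins for the kernel structures. */
struct zone_sketch {
    int zone_idx;
};

struct lruvec_sketch {
    struct zone_sketch *zone;   /* zone this lruvec belongs to */
    int node;                   /* NUMA node of that zone */
};

struct page_sketch {
    unsigned long flags;
};

/* Dynamically assigned ids index this table; NULL means "id unused". */
static struct lruvec_sketch *lruvec_table_sketch[1UL << LRUVEC_BITS];

/* page -> lruvec: extract the id from flags and look it up. */
static inline struct lruvec_sketch *page_lruvec_sketch(const struct page_sketch *page)
{
    return lruvec_table_sketch[(page->flags & LRUVEC_MASK) >> LRUVEC_SHIFT];
}

/* page -> zone goes through the lruvec instead of a zone id in flags. */
static inline struct zone_sketch *page_zone_sketch(const struct page_sketch *page)
{
    return page_lruvec_sketch(page)->zone;
}

/* page -> node likewise. */
static inline int page_node_sketch(const struct page_sketch *page)
{
    return page_lruvec_sketch(page)->node;
}
```

With this layout the zone and node lookups cost one extra dereference compared with reading them straight from page->flags, which is the trade-off being weighed in this thread.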
> Thanks,
> -Kame