Re: [PATCH v2 bpf-next] bpf: add optional memory accounting for maps
From: Alexei Starovoitov <hidden>
Date: 2019-01-31 18:36:10
On Thu, Jan 31, 2019 at 10:38:01AM +0100, Martynas Pumputis wrote:
Previously, memory allocated for a map was not accounted. Therefore,
this memory could not be taken into consideration by the cgroups
memory controller.
This patch introduces the "BPF_F_ACCOUNT_MEM" flag which enables
the memory accounting for a map, and it can be set during
the map creation ("BPF_MAP_CREATE") in "map_flags".
When enabled, we account only that amount of memory which is charged
against the "RLIMIT_MEMLOCK" limit.
To validate the change, first we create the memory cgroup-v1 "test-map":
# mkdir /sys/fs/cgroup/memory/test-map
And then we run the following program against the cgroup:
$ cat test_map.c
<..>
int main() {
	usleep(3 * 1000000);
	assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, 0) > 0);
	usleep(3 * 1000000);
}
# cgexec -g memory:test-map ./test_map &
# cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
397312
258048
<after 3 sec the map has been created>
# bpftool map list
19: hash flags 0x0
key 8B value 16B max_entries 65536 memlock 5771264B
# cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
401408
262144
As we can see, the memory allocated for the map is not accounted, as
397312B + 5771264B > 401408B.
Next, we enable the accounting and re-run the test:
$ cat test_map.c
<..>
int main() {
	usleep(3 * 1000000);
	assert(bpf_create_map(BPF_MAP_TYPE_HASH, 8, 16, 65536, BPF_F_ACCOUNT_MEM) > 0);
	usleep(3 * 1000000);
}
# cgexec -g memory:test-map ./test_map &
# cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
450560
307200
<after 3 sec the map has been created>
# bpftool map list
20: hash flags 0x80
key 8B value 16B max_entries 65536 memlock 5771264B
# cat /sys/fs/cgroup/memory/test-map/memory{,.kmem}.usage_in_bytes
6221824
6078464
This time, the memory (including kmem) is accounted, as
450560B + 5771264B <= 6221824B
Acked-by: Yonghong Song <redacted>
Signed-off-by: Martynas Pumputis <redacted>

see my reply in other thread.