Re: [PATCH bpf-next v2 8/8] bpf: add a selftest for cgroup hierarchical stats collection
From: Yonghong Song <hidden>
Date: 2022-06-29 06:49:00
Also in:
bpf, cgroups, lkml
On 6/28/22 5:09 PM, Yosry Ahmed wrote:
On Tue, Jun 28, 2022 at 12:14 AM Yosry Ahmed [off-list ref] wrote:
On Mon, Jun 27, 2022 at 11:47 PM Yosry Ahmed [off-list ref] wrote:
On Mon, Jun 27, 2022 at 11:14 PM Yonghong Song [off-list ref] wrote:
On 6/10/22 12:44 PM, Yosry Ahmed wrote:
Add a selftest that tests the whole workflow for collecting,
aggregating (flushing), and displaying cgroup hierarchical stats.

TL;DR:
- Whenever reclaim happens, vmscan_start and vmscan_end update
  per-cgroup percpu readings, and tell rstat which (cgroup, cpu) pairs
  have updates.
- When userspace tries to read the stats, vmscan_dump calls rstat to
  flush the stats, and outputs the stats in text format to userspace
  (similar to cgroupfs stats).
- rstat calls vmscan_flush once for every (cgroup, cpu) pair that has
  updates; vmscan_flush aggregates cpu readings and propagates updates
  to parents.

Detailed explanation:
- The test loads tracing bpf programs, vmscan_start and vmscan_end, to
  measure the latency of cgroup reclaim. Per-cgroup readings are stored
  in percpu maps for efficiency. When a cgroup reading is updated on a
  cpu, cgroup_rstat_updated(cgroup, cpu) is called to add the cgroup to
  the rstat updated tree on that cpu.
- A cgroup_iter program, vmscan_dump, is loaded and pinned to a file,
  for each cgroup. Reading this file invokes the program, which calls
  cgroup_rstat_flush(cgroup) to ask rstat to propagate the updates for
  all cpus and cgroups that have updates in this cgroup's subtree.
  Afterwards, the stats are exposed to the user. vmscan_dump returns 1
  to terminate iteration early, so that we only expose stats for one
  cgroup per read.
- An ftrace program, vmscan_flush, is also loaded and attached to
  bpf_rstat_flush. When rstat flushing is ongoing, vmscan_flush is
  invoked once for each (cgroup, cpu) pair that has updates. cgroups
  are popped from the rstat tree in a bottom-up fashion, so calls will
  always be made for cgroups that have updates before their parents.
  The program aggregates percpu readings to a total per-cgroup reading,
  and also propagates them to the parent cgroup. After rstat flushing
  is over, all cgroups will have correct updated hierarchical readings
  (including all cpus and all their descendants).
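The flush step described above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions, not the BPF programs from the patch: struct cg_stats, cg_update(), cg_flush_one(), cg_flush() and NR_CPUS are hypothetical names made up for illustration, and the model walks every (cgroup, cpu) pair, whereas rstat only visits pairs that actually have updates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical cpu count for this model */

/* Hypothetical userspace model of one cgroup's stats (not a kernel type). */
struct cg_stats {
	struct cg_stats *parent;   /* NULL for the root */
	uint64_t percpu[NR_CPUS];  /* growing per-cpu readings */
	uint64_t flushed[NR_CPUS]; /* per-cpu value seen at the last flush */
	uint64_t pending;          /* deltas pushed up by children, not yet folded in */
	uint64_t total;            /* hierarchical reading: self plus descendants */
};

/* Update path: record a new sample on one cpu. */
static void cg_update(struct cg_stats *cg, int cpu, uint64_t delta_ns)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cg->percpu[cpu] += delta_ns;
}

/* Flush one (cgroup, cpu) pair: fold the new per-cpu delta and any
 * pending child deltas into the total, then pass them to the parent. */
static void cg_flush_one(struct cg_stats *cg, int cpu)
{
	uint64_t delta = cg->percpu[cpu] - cg->flushed[cpu];

	cg->flushed[cpu] = cg->percpu[cpu];
	delta += cg->pending;
	cg->pending = 0;
	cg->total += delta;
	if (cg->parent)
		cg->parent->pending += delta;
}

/* Flush a subtree. 'order' must list children before their parents,
 * mirroring the bottom-up order described in the commit message. */
static void cg_flush(struct cg_stats **order, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			cg_flush_one(order[i], cpu);
}
```

With a root -> test -> {child1, child2} hierarchy, updating child1 by 100 and 50 on two cpus and child2 by 30, then flushing in the order child1, child2, test, root, leaves child1 at 150, child2 at 30, and both test and root at 180. The bottom-up order matters: flushing a parent before its children would leave the children's deltas in the parent's pending field until the next flush.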
Signed-off-by: Yosry Ahmed <redacted>

There is a selftest failure with test:

get_cgroup_vmscan_delay:PASS:output format 0 nsec
get_cgroup_vmscan_delay:PASS:cgroup_id 0 nsec
get_cgroup_vmscan_delay:PASS:vmscan_reading 0 nsec
get_cgroup_vmscan_delay:PASS:read cgroup_iter 0 nsec
get_cgroup_vmscan_delay:PASS:output format 0 nsec
get_cgroup_vmscan_delay:PASS:cgroup_id 0 nsec
get_cgroup_vmscan_delay:FAIL:vmscan_reading unexpected vmscan_reading: actual 0 <= expected 0
check_vmscan_stats:FAIL:child1_vmscan unexpected child1_vmscan: actual 781874 != expected 382092
check_vmscan_stats:FAIL:child2_vmscan unexpected child2_vmscan: actual -1 != expected -2
check_vmscan_stats:FAIL:test_vmscan unexpected test_vmscan: actual 781874 != expected 781873
check_vmscan_stats:FAIL:root_vmscan unexpected root_vmscan: actual 0 < expected 781874
destroy_progs:PASS:remove cgroup_iter pin 0 nsec
destroy_progs:PASS:remove cgroup_iter pin 0 nsec
destroy_progs:PASS:remove cgroup_iter pin 0 nsec
destroy_progs:PASS:remove cgroup_iter pin 0 nsec
destroy_progs:PASS:remove cgroup_iter pin 0 nsec
destroy_progs:PASS:remove cgroup_iter pin 0 nsec
destroy_progs:PASS:remove cgroup_iter pin 0 nsec
destroy_progs:PASS:remove cgroup_iter root pin 0 nsec
cleanup_bpffs:PASS:rmdir /sys/fs/bpf/vmscan/ 0 nsec
#33 cgroup_hierarchical_stats:FAIL

The test is passing on my setup. I am trying to figure out if there is something outside the setup done by the test that can cause the test to fail.

I can't reproduce the failure on my machine. It seems like for some reason reclaim is not invoked in one of the test cgroups, which results in the expected stats not being there. I have a few suspicions as to what might cause this, but I am not sure. If you have the capacity, do you mind re-running the test with the attached diff1.patch? (And maybe diff2.patch if that fails; this will cause OOMs in the test cgroup, so you might see some process-killed warnings.)
Thanks!

In addition to that, it looks like one of the cgroups has a "0" stat, which shouldn't happen unless one of the map update/lookup operations failed, which should log something using bpf_printk. I need to reproduce the test failure to investigate this properly. Did you observe this failure on your machine or in CI? Any instructions on how to reproduce or system setup?
I got "0" as well.

get_cgroup_vmscan_delay:FAIL:vmscan_reading unexpected vmscan_reading: actual 0 <= expected 0
check_vmscan_stats:FAIL:child1_vmscan unexpected child1_vmscan: actual 676612 != expected 339142
check_vmscan_stats:FAIL:child2_vmscan unexpected child2_vmscan: actual -1 != expected -2
check_vmscan_stats:FAIL:test_vmscan unexpected test_vmscan: actual 676612 != expected 676611
check_vmscan_stats:FAIL:root_vmscan unexpected root_vmscan: actual 0 < expected 676612

I don't have a special config. I am running on a qemu VM, similar to the CI environment, but it may have a slightly different config. The CI for this patch set won't work since the sleepable kfunc support patch is not available. Once you have that patch, bpf CI should be able to compile the patch set and run the tests.
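The "expected" values in the check_vmscan_stats failures are consistent with each parent reading being compared against the sum of its children's readings. That is an inference from the logged numbers, since the selftest source is elided in this message, and reading -1 as a failed read is likewise an assumption. A minimal C sketch of that arithmetic, using the hypothetical helper name expected_parent():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the reading a parent cgroup would be expected to
 * report if it equals the sum of its children's readings. A child whose
 * reading is -1 (assumed here to mean "read failed") skews the sum. */
static long long expected_parent(const long long *children, size_t n)
{
	long long sum = 0;

	assert(children != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		sum += children[i];
	return sum;
}
```

Feeding it the logged child1 and child2 readings, 781874 and -1, yields 781873, which matches "expected 781873" for test_vmscan in the first log; 676612 and -1 yield 676611, matching the second. Two children reading -1 sum to -2, matching the child2_vmscan line.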
An existing test also failed.

btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(union bpf_iter_link_info){.map = (struct){.map_fd = (__u32)1,},.cgroup '
test_btf_dump_struct_data:PASS:find struct sk_buff 0 nsec

Yeah, I see what happened there. bpf_iter_link_info was changed by the patch that introduced cgroup_iter, and this specific union is used by the test to exercise "union with nested struct" btf dumping. I will add a patch in the next version that updates the btf_dump_data test accordingly. Thanks.
test_btf_dump_struct_data:PASS:unexpected return value dumping sk_buff 0 nsec
btf_dump_data:PASS:verify prefix match 0 nsec
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed to return -E2BIG 0 nsec
btf_dump_data:PASS:ensure expected/actual match 0 nsec
btf_dump_data:PASS:verify prefix match 0 nsec
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed to return -E2BIG 0 nsec
btf_dump_data:PASS:ensure expected/actual match 0 nsec
#21/14 btf_dump/btf_dump: struct_data:FAIL

Please take a look.
---
 .../prog_tests/cgroup_hierarchical_stats.c    | 351 ++++++++++++++++++
 .../bpf/progs/cgroup_hierarchical_stats.c     | 234 ++++++++++++
 2 files changed, 585 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cgroup_hierarchical_stats.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_hierarchical_stats.c
[...]