Re: 答复: new seccomp mode aims to improve performance

From: Alexei Starovoitov <hidden>
Date: 2020-06-02 03:24:51
Also in: bpf

On Tue, Jun 02, 2020 at 02:42:35AM +0000, zhujianwei (C) wrote:

quoted

These are the test results on Linux 5.7.0-rc7 for aarch64.
Retpoline is disabled by default:
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
Not affected

bpf_jit_enable 1
bpf_jit_harden 0

We ran the unixbench syscall benchmark on the original kernel and on a modified one in which bpf_prog_run_pin_on_cpu() is replaced by a function that immediately returns 'allow'.
The unixbench syscall test case runs 5 system calls (close/umask/dup/getpid/getuid; 15 more syscalls are needed to run it) in a loop for 10 seconds, counts the iterations, and prints the count. We also added more filters (each with the same rules) to evaluate the situation Kees mentioned (a case like systemd-resolve), and found that it holds: more filters, more overhead. These are our results (./syscall 10 m):

original:
        seccomp_off:            10684939
        seccomp_on_1_filters:   8513805         overhead: 19.8%
        seccomp_on_4_filters:   7105592         overhead: 33.0%
        seccomp_on_32_filters:  2308677         overhead: 78.3%

after replacing bpf_prog_run_pin_on_cpu:
        seccomp_off:            10685244
        seccomp_on_1_filters:   9146483         overhead: 14.1%
        seccomp_on_4_filters:   8969886         overhead: 16.0%
        seccomp_on_32_filters:  6454372         overhead: 39.6%

N-filter bpf overhead:
        1_filters:              5.7%
        4_filters:              17.0%
        32_filters:             38.7%
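
The "N-filter bpf overhead" figures above appear to be the difference between the two overhead columns (original kernel minus stubbed kernel), i.e. the part of the slowdown attributed to actually running the BPF programs. A minimal sketch of that subtraction follows; the helper name bpf_share_pct() is hypothetical and not part of the original message:

```c
/*
 * Hypothetical helper: overhead attributed to running the BPF filters,
 * assumed to be (overhead on the original kernel) minus
 * (overhead with bpf_prog_run_pin_on_cpu() stubbed out).
 * Both arguments and the result are percentages.
 */
double bpf_share_pct(double orig_overhead_pct, double stub_overhead_pct)
{
	return orig_overhead_pct - stub_overhead_pct;
}
```

With the posted percentages, bpf_share_pct(19.8, 14.1) gives about 5.7, bpf_share_pct(33.0, 16.0) gives 17.0, and bpf_share_pct(78.3, 39.6) gives about 38.7, which matches the table.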

// kernel code modification place
static noinline u32 bpf_prog_run_pin_on_cpu_allow(const struct bpf_prog *prog, const void *ctx)
{
        return SECCOMP_RET_ALLOW;
}
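
The stub above replaces the per-filter call, which is why the cost grows with the number of filters: every attached filter is run once per syscall and the most restrictive result wins. The following is a simplified user-space model of that loop, not the kernel code; struct filter, run_filters(), allow_all(), deny_all() and the filter_calls counter are hypothetical names, and the action comparison is reduced to a plain signed compare:

```c
#include <stddef.h>

#define RET_ALLOW 0x7fff0000u	/* same value as SECCOMP_RET_ALLOW */
#define RET_ERRNO 0x00050000u	/* same value as SECCOMP_RET_ERRNO */

typedef unsigned int (*filter_fn)(const void *ctx);

/* Hypothetical model of one attached filter; prev points to the older one. */
struct filter {
	filter_fn run;
	const struct filter *prev;
};

unsigned long filter_calls;	/* counts how many filter programs were run */

unsigned int allow_all(const void *ctx)
{
	(void)ctx;
	filter_calls++;
	return RET_ALLOW;
}

unsigned int deny_all(const void *ctx)
{
	(void)ctx;
	filter_calls++;
	return RET_ERRNO;
}

/* Run every filter in the chain and keep the lowest (most restrictive) action. */
unsigned int run_filters(const struct filter *f, const void *ctx)
{
	unsigned int ret = RET_ALLOW;

	for (; f; f = f->prev) {
		unsigned int cur = f->run(ctx);

		if ((int)cur < (int)ret)
			ret = cur;
	}
	return ret;
}
```

In this model a chain of N filters costs N indirect calls per simulated syscall even when every filter allows it, which is the effect the 1/4/32-filter rows measure.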

quoted

This is apples to oranges.
As explained earlier:
https://lore.kernel.org/netdev/20200531171915.wsxvdjeetmhpsdv2@ast-mbp.dhcp.thefacebook.com/T/#u (local)
Please use __weak instead of static and redo the numbers.


We replaced 'static' with '__weak', tested in the same way, and got almost the same result in our test environment (aarch64).

-static noinline u32 bpf_prog_run_pin_on_cpu_allow(const struct bpf_prog *prog, const void *ctx)
+__weak noinline u32 bpf_prog_run_pin_on_cpu_allow(const struct bpf_prog *prog, const void *ctx)

original:
	seccomp_off:		10684939
	seccomp_on_1_filters:	8513805		overhead: 19.8%
	seccomp_on_4_filters:	7105592		overhead: 33.0%
	seccomp_on_32_filters:	2308677		overhead: 78.3%

after replacing bpf_prog_run_pin_on_cpu:
	seccomp_off:		10667195
	seccomp_on_1_filters:	9147454		overhead: 14.2%
	seccomp_on_4_filters:	8927605		overhead: 16.1%
	seccomp_on_32_filters:	6355476		overhead: 40.6%

Are you saying that replacing 'static' with '__weak' made it slower?!
Something doesn't add up. Please check the generated assembly.
By having such a 'static noinline bpf_prog_run_pin_on_cpu' you're telling the
compiler to remove most of the seccomp_run_filters() code, which will now
return only two possible values. That in turn means the large 'switch'
statement in __seccomp_filter() is also optimized. populate_seccomp_data()
is removed. Etc, etc. That explains the 14% vs 19% difference.
Maybe you have some debug option on? Like cant_migrate() is not a nop?
Or static_branch is not supported?
The sure way is to check the assembly.
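
A small user-space sketch of the effect described above, assuming a GCC or Clang toolchain. All names here (run_static, run_weak, dispatch, filter_static, filter_weak) are hypothetical stand-ins, not kernel functions. Both variants return the same constant, but only the 'static' one has a body the compiler is allowed to rely on when optimizing the caller; a weak definition may be replaced at link time, so the caller has to keep the call and the full switch:

```c
#define RET_ALLOW 0x7fff0000u	/* same value as SECCOMP_RET_ALLOW */
#define RET_KILL  0x00000000u	/* same value as SECCOMP_RET_KILL_THREAD */

/* Body is visible and cannot be overridden: the compiler may propagate
 * the constant return value into callers even though the call is not inlined. */
static __attribute__((noinline)) unsigned int run_static(const void *ctx)
{
	(void)ctx;
	return RET_ALLOW;
}

/* Weak symbol: another object file may supply a different definition,
 * so the return value must be treated as unknown at compile time. */
__attribute__((weak, noinline)) unsigned int run_weak(const void *ctx)
{
	(void)ctx;
	return RET_ALLOW;
}

/* Stand-in for the large 'switch' on the filter result. */
static int dispatch(unsigned int action)
{
	switch (action) {
	case RET_ALLOW:
		return 0;
	case RET_KILL:
		return -1;
	default:
		return -2;
	}
}

int filter_static(const void *ctx)
{
	return dispatch(run_static(ctx));	/* switch may fold to "return 0" */
}

int filter_weak(const void *ctx)
{
	return dispatch(run_weak(ctx));		/* switch has to stay */
}
```

Both functions return 0 at run time; the difference only shows up in the generated code. Compiling with something like "gcc -O2 -S" and comparing filter_static() with filter_weak() is one way to check whether a given compiler version actually performs this optimization.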
