Thread (7 messages) 7 messages, 3 authors, 2022-03-31

Re: [PATCH bpf] tools/runqslower: fix handle__sched_switch for updated tp sched_switch

From: Song Liu <hidden>
Date: 2022-03-30 06:43:31
Also in: bpf

On Mar 29, 2022, at 7:47 PM, Andrii Nakryiko [off-list ref] wrote:

On Tue, Mar 29, 2022 at 5:39 PM Song Liu [off-list ref] wrote:
quoted

quoted
On Mar 29, 2022, at 5:00 PM, Andrii Nakryiko [off-list ref] wrote:

On Tue, Mar 29, 2022 at 4:19 PM Song Liu [off-list ref] wrote:
quoted
TP_PROTO of sched_switch is updated with a new arg prev_state, which
causes runqslower load failure:

libbpf: prog 'handle__sched_switch': BPF program load failed: Permission denied
libbpf: prog 'handle__sched_switch': -- BEGIN PROG LOAD LOG --
R1 type=ctx expected=fp
0: R1=ctx(off=0,imm=0) R10=fp0
; int handle__sched_switch(u64 *ctx)
0: (bf) r7 = r1                       ; R1=ctx(off=0,imm=0) R7_w=ctx(off=0,imm=0)
; struct task_struct *next = (struct task_struct *)ctx[2];
1: (79) r6 = *(u64 *)(r7 +16)
func 'sched_switch' arg2 has btf_id 186 type STRUCT 'task_struct'
2: R6_w=ptr_task_struct(off=0,imm=0) R7_w=ctx(off=0,imm=0)
; struct task_struct *prev = (struct task_struct *)ctx[1];
2: (79) r2 = *(u64 *)(r7 +8)          ; R2_w=scalar() R7_w=ctx(off=0,imm=0)
3: (b7) r1 = 0                        ; R1_w=0
; struct runq_event event = {};
4: (7b) *(u64 *)(r10 -8) = r1         ; R1_w=P0 R10=fp0 fp-8_w=00000000
5: (7b) *(u64 *)(r10 -16) = r1        ; R1_w=P0 R10=fp0 fp-16_w=00000000
6: (7b) *(u64 *)(r10 -24) = r1        ; R1_w=P0 R10=fp0 fp-24_w=00000000
7: (7b) *(u64 *)(r10 -32) = r1        ; R1_w=P0 R10=fp0 fp-32_w=00000000
; if (prev->__state == TASK_RUNNING)
8: (61) r1 = *(u32 *)(r2 +24)
R2 invalid mem access 'scalar'
processed 9 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
libbpf: failed to load program 'handle__sched_switch'
libbpf: failed to load object 'runqslower_bpf'
libbpf: failed to load BPF skeleton 'runqslower_bpf': -13
failed to load BPF object: -13

Update runqslower to fix this issue. Also, as we are on this, use BPF_PROG
in runqslower for cleaner code.

Fixes: fa2c3254d7cf ("sched/tracing: Don't re-read p->state when emitting sched_switch event")
Signed-off-by: Song Liu <song@kernel.org>
---
tools/bpf/runqslower/runqslower.bpf.c | 19 +++++--------------
1 file changed, 5 insertions(+), 14 deletions(-)
It would be much less disruptive if that prev_state was added after
"next", but oh well...
Maybe we should change that.

+Valentin and Steven, how about we change the order with the attached
diff (not the original patch in this thread, but the one at the end of
this email)?
quoted
But anyways, let's handle this in a way that can handle both old
kernels and new ones and do the same change in libbpf-tool's
runqslower ([0]). Can you please follow up there as well?
Yeah, I will also fix that one.
Thanks!
quoted
quoted

We can use BPF CO-RE to detect which order of arguments running kernel
has by checking prev_state field existence in struct
trace_event_raw_sched_switch. Can you please try that? Use
bpf_core_field_exists() for that.
Do you mean something like

if (bpf_core_field_exists(ctx->prev_state))
   /* use ctx[2] and ctx[3] */
else
   /* use ctx[1] and ctx[2] */
yep, that's what I meant, except you don't have ctx->prev_state, you have to do:

if (bpf_core_field_exists(((struct trace_event_raw_sched_switch
*)0)->prev_state))
quoted
? I think we will need BTF for the arguments, which doesn't exist yet.
Did I miss something?
Probably :) struct trace_event_raw_sched_switch is in vmlinux.h
already for non-raw sched:sched_switch tracepoint. We just use that
type information for feature probing. From BPF verifier's perspective,
ctx[1] or ctx[2] will have proper BTF information (task_struct) during
program validation.
Sigh. I run pahole and saw trace_event_raw_sched_switch. Somehow I 
thought it was not the one. 

Will send v2 tomorrow. 

Thanks,
Song
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help