Thread (22 messages) 22 messages, 3 authors, 2023-08-02

Re: [RFC PATCH v3 3/6] sched, tracing: add to report task state in symbolic chars

From: Ze Gao <hidden>
Date: 2023-08-01 13:04:09
Also in: linux-perf-users, lkml

On Tue, Aug 1, 2023 at 7:34 PM Peter Zijlstra [off-list ref] wrote:
On Tue, Aug 01, 2023 at 05:01:21PM +0800, Ze Gao wrote:
quoted
Internal representations of task state are likely to be changed
or ordered, and reporting them to userspace without exporting
them as part of API is basically wrong, which can easily break
a userspace observability tool as kernel evolves. For example,
perf suffers from this and still reports wrong states as of this
writing.

OTOH, some masqueraded states like TASK_REPORT_IDLE and
TASK_REPORT_MAX are also reported inadvertently, which confuses
things even more and most userspace tools do not even take them
into consideration.

So add a new variable in company with the old raw value to
report task state in symbolic chars, which are self-explaining
and no further translation is needed. Of course this does not
break any userspace tool.

Note for PREEMPT_ACTIVE, we introduce 'p' to report it and use
the old conventions for the rest.
*sigh*... just because userspace if daft, we need to change the kernel?
Hi Peter,

Sorry that I don't quite agree with you on this one.

It's just the design that exporting internal details is fundamentally wrong.
And even worse,  I did not see any userspace tool is aware of masqueraded
states like TASK_REPORT_IDLE and TASK_REPORT_MAX and let alone
parse it correctly.  This confused me a lot when I decided to write my own bpf
version of sched-latency analysis tool and only after I figured out everything
underneath, I started to make things right here.

Again, I mean it's not me that deliberately "breaks" ABI here and I am
never meant
to upset anyone.  My confusion is why did people forget to update in-tree perf
the very last time they decide to rearrange the task state mapping
since we all agree
this is important "ABI" here.  I don't think it's the tool's fault.
And that's my initiative
to request this RFC.
Why do we need this character anyway, why not just print the state in
hex and leave it at that? These single character state things are a
relic, please just let them die.
I believe hex is ok only after having the reported task state mapping
appear in the
uapi headers, otherwise it's still useless to userspace especially for
value like
TASK_REPORT_IDLE and TASK_REPORT_MAX, which need to dig into the
kernel to see what the hell is going on here.

Thoughts?

Regards,
Ze
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help