Thread (9 messages), 3 authors, 2021-06-07

Re: [PATCH 0/3] arm64: perf: Make compat tracing better

From: Doug Anderson <dianders@chromium.org>
Date: 2021-06-07 20:44:04
Also in: linux-perf-users, lkml

Hi,

On Wed, Jun 2, 2021 at 10:56 AM Will Deacon [off-list ref] wrote:
> Hi Doug,
>
> Thanks for posting this, and sorry for the delay in getting to it.
>
> On Fri, May 07, 2021 at 01:55:10PM -0700, Douglas Anderson wrote:
> > The goal for this series is to improve "perf" behavior when 32-bit
> > userspace code is involved. This turns out to be fairly important for
> > Chrome OS which still runs 32-bit userspace for the time being (long
> > story there).
>
> Watch out, your days are numbered! See [1].

Yeah, folks on the Chrome OS team are aware and we're trying our
darndest to move away. It's been an unfortunate set of circumstances
that has kept us on 32-bit this long. :( BTW: I like your suggestion
of "retirement" as a solution to dealing with this problem, but I'm
not quite ready to retire yet.

> > I won't repeat everything said in the individual patches since
> > they are wordy enough as it is.
> >
> > Please enjoy and I hope this isn't too ugly/hacky for inclusion in
> > mainline.
> >
> > Thanks to Nick Desaulniers for his early review of these patches and
> > to Ricky for the super early prototype that some of this is based on.
>
> I can see that you've put a lot of effort into this, but I'm not thrilled
> with the prospect of maintaining these heuristics in the kernel. The
> callchain behaviour is directly visible to userspace, and all we'll be able
> to do is throw more heuristics at it if faced with any regression reports.
> Every assumption made about userspace behaviour results in diminishing
> returns where some set of programs no longer fall into the "supported"
> bucket and, on balance, I don't think the trade-off is worth it.
>
> If we were to do this in the kernel, then I'd like to see a spec for how
> frame-pointer based unwinding should work for Thumb and have it agreed
> upon and implemented by both GCC and LLVM. That way, we can implement
> the unwinder according to that spec and file bug reports against the
> compiler if it goes wrong.

Given how long this has been going on, I'd somewhat guess that getting
this implemented in GCC and LLVM is 1+ year out. Presumably Chrome OS
will be transitioned off 32-bit ARM by then.
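
For readers unfamiliar with what such a spec would have to pin down, here is a
minimal sketch of frame-pointer based unwinding over a simulated stack. The
record layout (saved frame pointer at fp, return address at fp + 4) is purely
hypothetical and chosen for illustration; it is not the layout used by GCC,
LLVM, or the kernel, and the function and variable names are made up.

```python
def unwind(memory, pc, fp, max_depth=64):
    """Walk a chain of frame records and return the sampled call chain.

    memory maps a 32-bit address to the word stored there (simulated stack).
    Assumed, hypothetical record layout: [fp] = caller's fp, [fp + 4] = return address.
    """
    callchain = [pc]
    while fp and len(callchain) < max_depth:
        saved_fp = memory.get(fp)
        return_addr = memory.get(fp + 4)
        if saved_fp is None or return_addr is None:
            break  # record is not readable: stop rather than guess
        callchain.append(return_addr)
        # The stack grows down, so a caller's record must sit at a higher
        # address. Anything else (including 0) ends the walk and avoids loops.
        if saved_fp <= fp:
            break
        fp = saved_fp
    return callchain


# Three nested frames; the outermost record has a saved fp of 0.
stack = {
    0x7000: 0x7020, 0x7004: 0x1100,
    0x7020: 0x7040, 0x7024: 0x1200,
    0x7040: 0x0000, 0x7044: 0x1300,
}
chain = unwind(stack, pc=0x1010, fp=0x7000)
print([hex(addr) for addr in chain])
```

The walk only works if every function in the chain stores its record in the
layout the unwinder expects, which is why an agreed spec matters.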

> In lieu of that, I think we must defer to userspace to unwind using DWARF.
> Perf supports this via PERF_SAMPLE_STACK_USER and PERF_SAMPLE_REGS_USER,
> which allows libunwind to be used to create the callchain. You haven't
> mentioned that here, so I'd be interested to know why not.

Good point. So I guess I didn't mention it because:

a) I really know very little about perf. I got roped into this because I
understand stack unwinding, not because I know how to use perf well.
:-P So I personally have no idea how to set this up.

b) In the little bit of reading I did about this, people seemed to say
that using libunwind for perf sampling was just too slow / too much
overhead.
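
As a rough illustration of both points: the perf tool exposes this mode as
`perf record --call-graph dwarf`, which requests the two sample types named
above. The sketch below builds that sample_type mask from the bit values in
`<linux/perf_event.h>` and estimates how much data each sample carries. The
helper names, the 8192-byte stack dump, the 16 registers and the 1000 Hz rate
are illustrative assumptions, and the estimate ignores the record header and
any other sampled fields.

```python
# Bit values from enum perf_event_sample_format in <linux/perf_event.h>.
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_REGS_USER = 1 << 12
PERF_SAMPLE_STACK_USER = 1 << 13


def dwarf_sample_type():
    """sample_type mask for unwinding in userspace from a register/stack dump."""
    return (PERF_SAMPLE_IP | PERF_SAMPLE_TID
            | PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER)


def bytes_per_sample(stack_dump_bytes, num_regs):
    """Approximate payload added by the two user-dump sample types.

    REGS_USER:  u64 abi + one u64 per sampled register.
    STACK_USER: u64 size + the raw stack bytes + u64 dyn_size.
    """
    regs = 8 + 8 * num_regs
    stack = 8 + stack_dump_bytes + 8
    return regs + stack


# Illustrative numbers only (assumptions, not measurements).
per_sample = bytes_per_sample(stack_dump_bytes=8192, num_regs=16)
per_second = per_sample * 1000  # assumed 1000 samples per second on one CPU
print(hex(dwarf_sample_type()), per_sample, per_second)
```

Every sample copies a slice of the user stack out of the kernel, and the
unwinding itself then happens later in userspace, which is where the overhead
concern comes from.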

> Finally, you've probably noticed that our unwinding code for compat tasks
> is basically identical to the code in arch/arm/. If the functionality is
> going to be extended, it should be done there first and then we will follow
> to be compatible.

That's fair. I doubt that submitting patches to this area of code for
arm32 would be enjoyable, so I'll pass if it's all the same.

Given your feedback, I think it's fair to consider ${SUBJECT} patch
abandoned then. I'll see if people want to land it as a private patch
in the Chrome OS tree for the time being until we can more fully
abandon arm32 support or until the ARM teams working on gcc and clang
come up with a standard that we can support more properly.

In the meantime, if anyone cares to pick this patch up and move
forward, feel free to do so with my blessing.

-Doug

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel