Re: [PATCH 4/7] powerpc/ftrace: Additionally nop out the preceding mflr with -mprofile-kernel
From: Nicholas Piggin <npiggin@gmail.com>
Date: 2019-06-19 07:16:08
Also in:
lkml
Michael Ellerman's on June 19, 2019 3:14 pm:
Hi Naveen, Sorry I meant to reply to this earlier .. :/ "Naveen N. Rao" [off-list ref] writes:quoted
With -mprofile-kernel, gcc emits 'mflr r0', followed by 'bl _mcount' to enable function tracing and profiling. So far, with dynamic ftrace, we used to only patch out the branch to _mcount(). However, mflr is executed by the branch unit that can only execute one per cycle on POWER9 and shared with branches, so it would be nice to avoid it where possible. We cannot simply nop out the mflr either. When enabling function tracing, there can be a race if tracing is enabled when some thread was interrupted after executing a nop'ed out mflr. In this case, the thread would execute the now-patched-in branch to _mcount() without having executed the preceding mflr. To solve this, we now enable function tracing in 2 steps: patch in the mflr instruction, use synchronize_rcu_tasks() to ensure all existing threads make progress, and then patch in the branch to _mcount(). We override ftrace_replace_code() with a powerpc64 variant for this purpose.According to the ISA we're not allowed to patch mflr at runtime. See the section on "CMODX".
According to "quasi patch class" engineering note, we can patch anything with a preferred nop. But that's written as an optional facility, which we don't have a feature to test for.
I'm also not convinced the ordering between the two patches is guaranteed by the ISA, given that there's possibly no isync on the other CPU.
Will they go through a context synchronizing event? synchronize_rcu_tasks() should ensure a thread is scheduled away, but I'm not actually sure it guarantees CSI if it's kernel->kernel. Could do a smp_call_function to do the isync on each CPU to be sure. Thanks, Nick