Re: [PATCH v3 1/2] ftrace: disable preemption between ftrace_test_recursion_trylock/unlock()
From: 王贇 <hidden>
Date: 2021-10-15 09:12:35
Also in:
linux-riscv, live-patching, lkml
On 2021/10/15 下午3:28, Petr Mladek wrote:
On Fri 2021-10-15 11:13:08, 王贇 wrote:quoted
On 2021/10/14 下午11:14, Petr Mladek wrote: [snip]quoted
quoted
- return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX); + int bit; + + bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX); + /* + * The zero bit indicate we are nested + * in another trylock(), which means the + * preemption already disabled. + */ + if (bit > 0) + preempt_disable_notrace();Is this safe? The preemption is disabled only when trace_test_and_set_recursion() was called by ftrace_test_recursion_trylock(). We must either always disable the preemtion when bit >= 0. Or we have to disable the preemtion already in trace_test_and_set_recursion().Internal calling of trace_test_and_set_recursion() will disable preemption on succeed, it should be safe.trace_test_and_set_recursion() does _not_ disable preemtion! It works only because all callers disable the preemption.
Yup.
It means that the comment is wrong. It is not guarantted that the preemption will be disabled. It works only by chance.quoted
We can also consider move the logical into trace_test_and_set_recursion() and trace_clear_recursion(), but I'm not very sure about that... ftrace internally already make sure preemption disabledHow? Because it calls trace_test_and_set_recursion() via the trylock() API?
I mean currently all the direct caller of trace_test_and_set_recursion() have disabled the preemption as you mentioned above, but yes if anyone later write some kernel code to call trace_test_and_set_recursion() without disabling preemption after, then promise broken.
quoted
, what uncovered is those users who call API trylock/unlock, isn't it?And this is exactly the problem. trace_test_and_set_recursion() is in a public header. Anyone could use it. And if anyone uses it in the future without the trylock() and does not disable the preemtion explicitely then we are lost again. And it is even more dangerous. The original code disabled the preemtion on various layers. As a result, the preemtion was disabled several times for sure. It means that the deeper layers were always on the safe side. With this patch, if the first trace_test_and_set_recursion() caller does not disable preemtion then trylock() will not disable it either and the entire code is procceed with preemtion enabled.
Yes, what confusing me at first is that I think people who call trace_test_and_set_recursion() without trylock() can only be a ftrace hacker, not a user, but in case if anyone can use it without respect to preemption stuff, then I think the logical should be inside trace_test_and_set_recursion() rather than trylock().
quoted
quoted
Finally, the comment confused me a lot. The difference between nesting and recursion is far from clear. And the code is tricky liky like hell :-) I propose to add some comments, see below for a proposal.The comments do confusing, I'll make it something like: The zero bit indicate trace recursion happened, whatever the recursively call was made by ftrace handler or ftrace itself, the preemption already disabled.I am sorry but it is still confusing. We need to find a better way how to clearly explain the difference between the safe and unsafe recursion. My understanding is that the recursion is: + "unsafe" when the trace code recursively enters the same trace point. + "safe" when ftrace_test_recursion_trylock() is called recursivelly while still processing the same trace entry.
Maybe take some example would be easier to understand... Currently there are two way of using ftrace_test_recursion_trylock(), one with TRACE_FTRACE_XXX we mark as A, one with TRACE_LIST_XXX we mark as B, then: A followed by B on same context got bit > 0 B followed by A on any context got bit 0 A followed by A on same context got bit > 0 A followed by A followed by A on same context got bit -1 B followed by B on same context got bit > 0 B followed by B followed by B on same context got bit -1 If we get rid of the TRACE_TRANSITION_BIT which allowed recursion for onetime, then it would be: A followed by B on same context got bit > 0 B followed by A on any context got bit 0 A followed by A on same context got bit -1 B followed by B on same context got bit -1 So as long as no continuously AAA it's safe?
quoted
quoted
quoted
+ + return bit; } /**@@ -222,9 +233,13 @@ static __always_inline int ftrace_test_recursion_trylock(unsigned long ip, * @bit: The return of a successful ftrace_test_recursion_trylock() * * This is used at the end of a ftrace callback. + * + * Preemption will be enabled (if it was previously enabled). */ static __always_inline void ftrace_test_recursion_unlock(int bit) { + if (bit)This is not symetric with trylock(). It should be: if (bit > 0) Anyway, trace_clear_recursion() quiently ignores bit != 0Yes, bit == 0 should not happen in here.Yes, it "should" not happen. My point is that we could make the API more safe. We could do the right thing when ftrace_test_recursion_unlock() is called with negative @bit. Ideally, we should also warn about the mis-use.
Agree with a WARN here on bit 0.
Anyway, let's wait for Steven. It seems that he found another problem with the API that should be solved first. The fix will probably also help to better understand the "safe" vs "unsafe" recursion.
Cool~ Regards, Michael Wang
Best Regards, Petr