Thread: 14 messages, 3 authors, 2025-08-25

Re: [PATCH v3 2/2] tracing/preemptirq: Optimize preempt_disable/enable() tracepoint overhead

From: Wander Lairson Costa <hidden>
Date: 2025-08-01 13:30:23
Also in: lkml

On Tue, Jul 8, 2025 at 3:54 PM Peter Zijlstra [off-list ref] wrote:
On Tue, Jul 08, 2025 at 09:54:06AM -0300, Wander Lairson Costa wrote:
On Mon, Jul 07, 2025 at 01:20:03PM +0200, Peter Zijlstra wrote:
On Fri, Jul 04, 2025 at 02:07:43PM -0300, Wander Lairson Costa wrote:
Similar to the IRQ tracepoint, the preempt tracepoints are typically
disabled in production systems due to the significant overhead they
introduce even when not in use.

The overhead primarily comes from two sources: First, when tracepoints
are compiled into the kernel, preempt_count_add() and preempt_count_sub()
become external function calls rather than inlined operations. Second,
these functions perform unnecessary preempt_count() checks even when the
tracepoint itself is disabled.

This optimization introduces an early check of the tracepoint static key,
which allows us to skip both the function call overhead and the redundant
preempt_count() checks when tracing is disabled. The change maintains all
existing functionality when tracing is active while significantly
reducing overhead for the common case where tracing is inactive.
This one in particular I worry about the code gen impact. There are a
*LOT* of preempt_{dis,en}able() sites in the kernel and now they all get
this static branch and call crud on.

We spend significant effort to make preempt_{dis,en}able() as small as
possible.
Thank you for the feedback, it's much appreciated. I just want to make sure
I'm on the right track. If I understand your concern correctly, it revolves
around the overhead this patch might introduce, specifically to the binary
size and its effect on the iCache, when the kernel is built with preempt
tracepoints enabled. Is that an accurate summary?
Yes, specifically:

preempt_disable()
        incl    %gs:__preempt_count



preempt_enable()
        decl    %gs:__preempt_count
        jz      do_schedule
1:      ...

do_schedule:
        call    __SCT__preemptible_schedule
        jmp     1b


your proposal adds significantly to this.
Here is a breakdown of the patch's behavior under the different kernel
configurations:

* When DEBUG_PREEMPT is defined, the behavior is identical to the
  current implementation, with calls to preempt_count_add/sub().
* When both DEBUG_PREEMPT and TRACE_PREEMPT_TOGGLE are disabled, the
  generated code is also unchanged.
* The primary change occurs when only TRACE_PREEMPT_TOGGLE is defined.
  In this case, the code uses a static key test instead of a function
  call. As the benchmarks show, this approach is faster when the
  tracepoints are disabled.

The main trade-off is that enabling or disabling these tracepoints
will require the kernel to patch more code locations due to the use of
static keys.
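
To make the TRACE_PREEMPT_TOGGLE-only case concrete, here is a rough
userspace sketch of the idea, not the patch itself. Every name in it
(trace_key_model, preempt_count_model, the *_model() and *_traced()
helpers) is hypothetical, and a plain bool stands in for the static key;
in the kernel the test would be a patched jump/nop rather than a memory
load, so this only models the control flow, not the code size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tracepoint static key (assumption:
 * a real static key is a patched branch, not a variable load). */
static bool trace_key_model = false;

/* Models the per-CPU preempt counter. */
static int preempt_count_model = 0;

/* Counts how often the out-of-line (traced) path was taken. */
static int slow_path_calls = 0;

/* Out-of-line slow path: where the count checks and the tracepoint
 * would live when tracing is active. */
static void preempt_count_add_traced(int val)
{
    slow_path_calls++;
    preempt_count_model += val;
}

static void preempt_count_sub_traced(int val)
{
    slow_path_calls++;
    preempt_count_model -= val;
}

/* Inline fast path: test the key first, and only call out of line
 * when tracing is enabled. */
static inline void preempt_disable_model(void)
{
    if (trace_key_model)
        preempt_count_add_traced(1);
    else
        preempt_count_model += 1;
}

static inline void preempt_enable_model(void)
{
    if (trace_key_model)
        preempt_count_sub_traced(1);
    else
        preempt_count_model -= 1;
}

/* Usage sketch: run one disable/enable pair and report how many
 * out-of-line calls it needed. */
static int demo_pair(bool tracing_on)
{
    trace_key_model = tracing_on;
    slow_path_calls = 0;
    preempt_disable_model();
    preempt_enable_model();
    assert(preempt_count_model == 0);
    return slow_path_calls;
}
```

With the key off, a disable/enable pair stays entirely inline; with it
on, both halves go through the out-of-line helpers. The cost of the
scheme is that every call site now carries the key test, which is the
code-size concern raised above.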