Re: [PATCH] ftrace: Avoid potential division by zero in function_stat_show()
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2025-02-04 22:20:10
Also in:
lkml
Subsystem:
function hooks (ftrace), the rest, tracing · Maintainers:
Steven Rostedt, Masami Hiramatsu, Linus Torvalds
On Tue, 4 Feb 2025 10:43:02 +0300 Nikolay Kuratov [off-list ref] wrote:
Thank you for the review! `counter > rec->counter` check does not protect us from overflows, so this could mislead especially with the comment included.
Ah you're right, as there's a big multiplication there.
I think we should either eliminate overflows or only protect from zero division. To prevent overflows it is wise to split the division into three and forget about it. But in 32-bit case overflows will still occur. As for zero division protection, maybe this variant even more concise? #ifdef CONFIG_FUNCTION_GRAPH_TRACER seq_puts(m, " "); stddev = 0; unsigned long denom = rec->counter * (rec->counter - 1) * 1000;
Actually, if the denom overflows, then this is bogus anyway. We should stop calculating if it overflows. That would be if: rec->counter * (rec->counter - 1) * 1000 >= 2^32 or >= 2^64 depending on the architecture. So we can calculate what value rec->counter has to be for that to happen. Using the quadratic equation (haven't used this since college!), we can figure that out. x = rec->counter x * (x - 1) * 1000 = (2^32 - 1) // use minus 1 just to be sure x * (x - 1) = (2^32 - 1) / 1000 x^2 - x = (2^32 - 1) / 1000 x^2 - x - (2^32 - 1) / 1000 = 0 x = (-b +/- sqrt(b^2 - 4ac)) / 2a a = 1 b = -1 c = -(2^32 - 1) / 1000 = -4294967.295 x = (-1 +/- sqrt((-1)^2 - 4 * -4294967.295)) / 2 x = 2071.930 for 32bit For 64bit we have c = -(2^64 - 1) / 1000 = -18446744073709551.615 That makes x = 135818790.812 We should stop calculating stddev when rec->counter > 2071 on 32 bit and > 135818790 on 64 bit! Feel free to check my math. The below patch should fix this. -- Steve
if (denom) {
stddev = rec->counter * rec->time_squared -
rec->time * rec->time;
stddev = div64_ul(stddev, denom);
}
trace_seq_init(&s);
trace_print_graph_duration(rec->time, &s);diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 2e113f8b13a2..ae90443f32af 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c@@ -419,10 +419,42 @@ struct ftrace_profile { unsigned long counter; #ifdef CONFIG_FUNCTION_GRAPH_TRACER unsigned long long time; + unsigned long long stddev_time; unsigned long long time_squared; #endif }; +/* + * The calculation for stddev will overflow when the counter + * algorithm overflows the long size: + * + * rec->counter * (rec->counter - 1) * 1000 >= 2^BITS_PER_LONG + * + * Using the quadratic equation: x = (-b +/- sqrt(b^2 - 4ac)) / 2a + * we can figure out what the max rec->counter is before it + * overflows. + * + * x = rec->counter + * x * (x - 1) * 1000 = 2^BITS_PER_LONG - 1 // -1 for rounding + * x * (x - 1) = (2^BITS_PER_LONG - 1) / 1000 + * x^2 - x = (2^BITS_PER_LONG - 1) / 1000 + * x^2 - x - (2^BITS_PER_LONG - 1) / 1000 = 0 + * + * a = 1 + * b = -1 + * c = -(2^BITS_PER_LONG - 1) / 1000 + * + * x = (1 +/- sqrt(1 - 4 * -(2^BITS_PER_LONG - 1) / 1000)) / 2 + * + * For 32bit that is: x = 2072.930 (or 2072) + * For 64bit that is: x = 135818791.812 (or 135818791) + */ +#ifdef CONFIG_64BIT +# define MAX_STDDEV_COUNTER 135818791 +#else +# define MAX_STDDEV_COUNTER 2072 +#endif + struct ftrace_profile_page { struct ftrace_profile_page *next; unsigned long index;
@@ -566,12 +598,16 @@ static int function_stat_show(struct seq_file *m, void *v) if (rec->counter <= 1) stddev = 0; else { + unsigned long counter; + + counter = rec->counter < MAX_STDDEV_COUNTER ? rec->counter : + MAX_STDDEV_COUNTER - 1; /* * Apply Welford's method: * s^2 = 1 / (n * (n-1)) * (n * \Sum (x_i)^2 - (\Sum x_i)^2) */ - stddev = rec->counter * rec->time_squared - - rec->time * rec->time; + stddev = counter * rec->time_squared - + rec->stddev_time * rec->stddev_time; /* * Divide only 1000 for ns^2 -> us^2 conversion.
@@ -894,7 +930,19 @@ static void profile_graph_return(struct ftrace_graph_ret *trace, rec = ftrace_find_profiled_func(stat, trace->func); if (rec) { rec->time += calltime; - rec->time_squared += calltime * calltime; + /* + * This is used to calculate stddev, but the calculation + * uses Apply Welford's method that will multiply + * rec->counter * (rec->counter - 1) and also divide from + * that. The calculation will be bogus if that value overflows + * the long size it is stored in. Stop the calculation + * when the counter hits the point it will overflow. + * The stddev shouldn't change much after that anyway. + */ + if (rec->counter < MAX_STDDEV_COUNTER) { + rec->stddev_time = rec->time; + rec->time_squared += calltime * calltime; + } } out: