Re: [PATCH] tools/lib/perf: Fix -Werror=alloc-size-larger-than in cpumap.c
From: Ian Rogers <irogers@google.com>
Date: 2025-05-21 17:39:18
Also in:
linux-perf-users
On Wed, May 21, 2025 at 10:28 AM Likhitha Korrapati [off-list ref] wrote:
Hi Ian,

On 5/21/25 21:15, Ian Rogers wrote:
On Wed, May 21, 2025 at 6:03 AM Likhitha Korrapati [off-list ref] wrote:
Hi Arnaldo,

On 5/14/25 02:43, Arnaldo Carvalho de Melo wrote:
On Fri, May 02, 2025 at 01:14:32PM +0530, Mukesh Kumar Chaurasiya wrote:
On Fri, Apr 25, 2025 at 02:46:43PM -0300, Arnaldo Carvalho de Melo wrote:
Maybe that max() call in perf_cpu_map__intersect() somehow makes the compiler happy.

And in perf_cpu_map__alloc() all calls seem to validate it.

Like:

+++ b/tools/lib/perf/cpumap.c
@@ -411,7 +411,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
 	}
 	tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
-	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
+	tmp_cpus = calloc(tmp_len, sizeof(struct perf_cpu));
 	if (!tmp_cpus)
 		return -ENOMEM;

⬢ [acme@toolbx perf-tools-next]$

And better, do the max size that the compiler is trying to help us catch?
Isn't it better to use perf_cpu_map__nr? That should fix this problem.

Maybe, have you tried it?

I have tried this method and it works.

--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -410,7 +410,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
 		return 0;
 	}
-	tmp_len = max(__perf_cpu_map__nr(*orig), __perf_cpu_map__nr(other));
+	tmp_len = perf_cpu_map__nr(*orig) + perf_cpu_map__nr(other);
 	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
 	if (!tmp_cpus)
 		return -ENOMEM;

I will send a V2 with this change if this looks good.

How is this different from the existing code:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/cpumap.c?h=perf-tools-next#n423

	tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
	if (!tmp_cpus)
		return -ENOMEM;

Thanks,
Ian

I gave the wrong diff. Here is the corrected diff.

--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -410,7 +410,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
 		return 0;
 	}
-	tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
+	tmp_len = perf_cpu_map__nr(*orig) + perf_cpu_map__nr(other);
 	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
 	if (!tmp_cpus)
 		return -ENOMEM;

I am using perf_cpu_map__nr instead of __perf_cpu_map__nr.
Ok, why is that a fix? The function declarations are near identical and perf_cpu_map__nr is implemented in terms of __perf_cpu_map__nr:

static int __perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
	return RC_CHK_ACCESS(cpus)->nr;
}

int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
	return cpus ? __perf_cpu_map__nr(cpus) : 1;
}

My guess is that being static allows all of the code to be analyzed within the compilation unit, which produces the warning/error; your change just defeats the analysis. The analysis could easily kick in again with Link Time Optimization. I'd prefer making these `__nr` functions return `unsigned int` or `size_t` over changes like this.

Thanks,
Ian
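
To illustrate the last suggestion, here is a minimal sketch, not the actual libperf code, of what an unsigned count could look like. With a signed `int` count the compiler has to consider negative values, which become enormous once converted to `size_t` for the allocation size; that is presumably what -Walloc-size-larger-than reports. The types `struct cpu_entry` and `struct counted_map` and the helpers `counted_map__nr()` and `alloc_merge_buf()` are hypothetical stand-ins invented for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_cpu. */
struct cpu_entry {
	int cpu;
};

/* Hypothetical stand-in for a CPU map; only the signed count matters here. */
struct counted_map {
	int nr;
};

/* Report the count as size_t so callers never see a negative length. */
static size_t counted_map__nr(const struct counted_map *map)
{
	return map->nr > 0 ? (size_t)map->nr : 0;
}

/*
 * Allocate a zeroed buffer large enough for the entries of both maps.
 * Returns NULL for an empty result, on overflow, or on allocation failure.
 */
static struct cpu_entry *alloc_merge_buf(const struct counted_map *a,
					 const struct counted_map *b,
					 size_t *out_len)
{
	size_t len = counted_map__nr(a) + counted_map__nr(b);

	if (len == 0 || len > SIZE_MAX / sizeof(struct cpu_entry))
		return NULL;

	*out_len = len;
	return calloc(len, sizeof(struct cpu_entry));
}
```

Because the length is unsigned and bounded before the allocation, the compiler no longer has a negative range to reason about, whether or not it can see through the helper. Whether this silences the warning in the real tree would still need to be checked with the affected compiler.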
Thanks,
Likhitha.

Thanks Likhitha.
One question I have, in perf_cpu_map__nr, the function is returning 1 in case *cpus is NULL. Is it ok to do that? Wouldn't it cause problems?

Indeed this better be documented, as by just looking at:

int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
	return cpus ? __perf_cpu_map__nr(cpus) : 1;
}

It really doesn't make much sense to say that a NULL map has one entry.

But the next functions are:

bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map)
{
	return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true;
}

bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map)
{
	if (!map)
		return true;

	return __perf_cpu_map__nr(map) == 1 && __perf_cpu_map__cpu(map, 0).cpu == -1;
}

bool perf_cpu_map__is_empty(const struct perf_cpu_map *map)
{
	return map == NULL;
}

So it seems that a NULL cpu map means "any/all CPU" and a map with just one entry would have as its content "-1" that would mean "any/all CPU".

Ian did work on trying to simplify/clarify this, so maybe he can chime in :-)

- Arnaldo
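
The convention described above can be sketched in a self-contained form. This is only an illustration: `struct perf_cpu_sketch`, `struct cpu_map_sketch` and the `sketch_*` functions are hypothetical simplifications that mirror the quoted functions without reference counting, and they are not the libperf definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_cpu. */
struct perf_cpu_sketch {
	int cpu;
};

/* Hypothetical stand-in for struct perf_cpu_map, with a small fixed array. */
struct cpu_map_sketch {
	int nr;
	struct perf_cpu_sketch map[4];
};

/* Mirrors the quoted perf_cpu_map__nr(): a NULL map reports one entry. */
static int sketch_nr(const struct cpu_map_sketch *cpus)
{
	return cpus ? cpus->nr : 1;
}

/* Mirrors the quoted perf_cpu_map__is_any_cpu_or_is_empty(). */
static bool sketch_is_any_cpu_or_is_empty(const struct cpu_map_sketch *map)
{
	if (!map)
		return true;

	return map->nr == 1 && map->map[0].cpu == -1;
}

/* Mirrors the quoted perf_cpu_map__is_empty(): only NULL counts as empty. */
static bool sketch_is_empty(const struct cpu_map_sketch *map)
{
	return map == NULL;
}
```

Under these simplified definitions, a NULL map and a one-entry map holding -1 both report a count of 1 and both satisfy the "any CPU or empty" check, which is the overlap Arnaldo points out.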