Thread (25 messages) 25 messages, 2 authors, 2022-09-30

Re: [PATCH v4 4/8] kallsyms: Improve the performance of kallsyms_lookup_name()

From: Petr Mladek <pmladek@suse.com>
Date: 2022-09-22 13:18:24
Also in: live-patching, lkml

On Thu 2022-09-22 15:21:57, Leizhen (ThunderTown) wrote:

On 2022/9/22 15:02, Petr Mladek wrote:
quoted
On Thu 2022-09-22 10:15:22, Leizhen (ThunderTown) wrote:
quoted

On 2022/9/21 23:25, Petr Mladek wrote:
quoted
On Tue 2022-09-20 15:13:13, Zhen Lei wrote:
quoted
Currently, to search for a symbol, we need to expand the symbols in
'kallsyms_names' one by one, and then use the expanded string for
comparison. This process can be optimized.

And now scripts/kallsyms no longer compresses the symbol types, each
symbol type always occupies one byte. So we can first compress the
searched symbol and then make a quick comparison based on the compressed
length and content. In this way, for entries with mismatched lengths,
there is no need to expand and compare strings. And for those matching
lengths, there's no need to expand the symbol. This saves a lot of time.
According to my test results, the average performance of
kallsyms_lookup_name() can be improved by 20 to 30 times.

The pseudo code of the test case is as follows:
static int stat_find_name(...)
{
	start = sched_clock();
	(void)kallsyms_lookup_name(name);
	end = sched_clock();
	//Update min, max, cnt, sum
}

/*
 * Traverse all symbols in sequence and collect statistics on the time
 * taken by kallsyms_lookup_name() to lookup each symbol.
 */
kallsyms_on_each_symbol(stat_find_name, NULL);

The test results are as follows (twice):
After : min=5250, max=  726560, avg= 302132
After : min=5320, max=  726850, avg= 301978
Before: min=170,  max=15949190, avg=7553906
Before: min=160,  max=15877280, avg=7517784

The average time consumed is only 4.01% and the maximum time consumed is
only 4.57% of the time consumed before optimization.

Signed-off-by: Zhen Lei <redacted>
---
 kernel/kallsyms.c | 79 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 76 insertions(+), 3 deletions(-)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 3e7e2c2ad2f75ef..2d76196cfe89f34 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -87,6 +87,71 @@ static unsigned int kallsyms_expand_symbol(unsigned int off,
+{
+	int i, j, k, n;
+	int len, token_len;
+	const char *token;
+	unsigned char token_idx[KSYM_NAME_LEN];
+	unsigned char token_bak[KSYM_NAME_LEN];
Why do we need two buffers? It should be possible to compress the name
in the same buffer as it is done in compress_symbols() in scripts/callsyms.c.
Because the performance would be a little better. Now this function takes
just over a microsecond. Currently, it takes about 250 microseconds on
average to lookup a symbol, so adding a little more time to this function
doesn't affect the overall picture. I'll modify and test it as you suggest
below.
We need to be careful about a stack overflow. I have seen that
KSYM_NAME_LEN might need to be increased to 512 because of
Rust support, see
https://lore.kernel.org/r/20220805154231.31257-6-ojeda@kernel.org (local)
OK. Thanks for your information. I decided to add kallsyms_best_token_table[],
kallsyms_best_token_table_len, so that we only need one namebuf[], like
kallsyms_expand_symbol().
Thanks for the effort. Adding kallsyms_best_token_table[] sounds like
the right solution.

Best Regards,
Petr
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help