Re: [PATCH V2 4/9] tools/perf: Add support to capture and parse raw instruction in objdump
From: Christophe Leroy <hidden>
Date: 2024-05-07 09:35:24
Also in:
linux-perf-users, lkml
On 06/05/2024 at 14:19, Athira Rajeev wrote:
Add support to capture and parse raw instruction in objdump.
What's the purpose of using 'objdump' for reading raw instructions? Can't they be read directly without invoking 'objdump'? It looks odd to me to use objdump to provide readable text and then parse it back.
Currently, the perf tool infrastructure uses the "--no-show-raw-insn" option with "objdump" while disassembling. An example from powerpc with this option for an instruction address is:
Yes, and that makes sense because the purpose of objdump is to provide human-readable annotations, not to perform automated analysis. Am I missing something?
Snippet from:
objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
c0000000010224b4: lwz r10,0(r9)
This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. To find whether there is a memory
reference in the operands, the "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to tackle instructions of the
form "mov (%rax), %rcx".
In the case of powerpc, instructions using "(" are not the only memory
instructions. For example, the above instruction can also be of extended
form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
category and extract the source/target registers, the patch adds support
for using the raw instruction. With the raw instruction, macros are added
to extract the opcode and register fields.
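As a rough illustration of the field extraction described above, here is a minimal sketch. The macro and function names are hypothetical and are not taken from the patch. It assumes the 32-bit instruction word is already in host integer form and relies only on the standard powerpc layout (primary opcode in the top 6 bits, followed by 5-bit register fields).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper names, not the macros from the patch.
 * powerpc numbers bits from the most significant end (bit 0). */
#define PPC_OP(insn)  (((insn) >> 26) & 0x3f)   /* bits 0-5: primary opcode */
#define PPC_RT(insn)  (((insn) >> 21) & 0x1f)   /* bits 6-10: target register */
#define PPC_RA(insn)  (((insn) >> 16) & 0x1f)   /* bits 11-15: base register */
#define PPC_RB(insn)  (((insn) >> 11) & 0x1f)   /* bits 16-20: index register */
#define PPC_XO(insn)  (((insn) >> 1) & 0x3ff)   /* bits 21-30: X-form extended opcode */

/* Primary opcode 31 hosts the indexed (X-form) loads such as lwzx,
 * whose disassembly contains no "(" at all. */
static bool ppc_is_op31(uint32_t insn)
{
	return PPC_OP(insn) == 31;
}
```

With these, "lwz r10,0(r9)" (0x81490000) yields primary opcode 32, while "lwzx r10,0,r19" (0x7d40982e) yields primary opcode 31 with extended opcode 23, so the two can be told apart without looking for "(" in the operand text.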
"struct ins_operands" and "struct ins" are updated to carry the opcode and
raw instruction binary code (raw_insn). Function "disasm_line__parse"
is updated to fill the raw instruction hex value and opcode in the newly
added fields. There are no changes in existing code paths, which parse
the disassembled code. Architectures that use the instruction name and
the present approach are not altered. Since this approach targets powerpc,
the macro implementation is added for powerpc as of now.
Example:
representation using --show-raw-insn in objdump gives result:
38 01 81 e8 ld r4,312(r1)
Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:
 ______________________________________
 | 58 |  RT  |  RA  |      DS      |  |
 --------------------------------------
 0    6     11     16             30 31
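To make the layout concrete, the following sketch decodes the example word by hand. The struct and function names are hypothetical and not part of the patch. It assumes the bytes "38 01 81 e8" were dumped in little-endian memory order, so the instruction word is 0xe8810138.

```c
#include <stdint.h>

/* Hypothetical container for the DS-form fields shown in the diagram. */
struct ppc_ds_form {
	unsigned int opcode;	/* bits 0-5 */
	unsigned int rt;	/* bits 6-10 */
	unsigned int ra;	/* bits 11-15 */
	int offset;		/* DS (bits 16-29) with two zero bits appended */
};

static struct ppc_ds_form ppc_decode_ds(uint32_t insn)
{
	struct ppc_ds_form f;

	f.opcode = (insn >> 26) & 0x3f;
	f.rt = (insn >> 21) & 0x1f;
	f.ra = (insn >> 16) & 0x1f;
	/* Mask off the low two bits (bits 30-31), then sign-extend from
	 * 16 bits. The narrowing cast assumes the usual two's complement
	 * behaviour of common compilers. */
	f.offset = (int16_t)(insn & 0xfffc);
	return f;
}
```

For 0xe8810138 this gives opcode 58, RT 4, RA 1 and offset 312, matching "ld r4,312(r1)".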
Function "disasm_line__parse" is updated to capture:
line: 38 01 81 e8 ld r4,312(r1)
opcode and raw instruction "38 01 81 e8"
Raw instruction is used later to extract the reg/offset fields.
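The capture step could look roughly like the sketch below. This is not the code in "disasm_line__parse"; the function name is hypothetical, and it assumes the line starts with four space-separated hex bytes in little-endian memory order, as in the example above. A big-endian dump would need the bytes combined in the opposite order.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read four leading hex bytes from an objdump line
 * and combine them into one instruction word, treating the first byte
 * as the least significant (little-endian dump assumed).
 * Returns 0 on success, -1 if the line does not start with four bytes. */
static int parse_raw_insn_le(const char *line, uint32_t *raw_insn)
{
	unsigned int b0, b1, b2, b3;

	if (sscanf(line, "%2x %2x %2x %2x", &b0, &b1, &b2, &b3) != 4)
		return -1;

	*raw_insn = (uint32_t)b0 | ((uint32_t)b1 << 8) |
		    ((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
	return 0;
}
```

Given the line "38 01 81 e8 ld r4,312(r1)", this stores 0xe8810138, from which the primary opcode is (0xe8810138 >> 26) & 0x3f = 58.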
Signed-off-by: Athira Rajeev <redacted>
---