Re: [PATCH 0/3] arm64/ptrace: allow to get all registers on syscall traps
From: Andrei Vagin <hidden>
Date: 2021-01-27 08:16:30
Also in:
lkml
On Tue, Jan 19, 2021 at 2:08 PM Andrei Vagin [off-list ref] wrote:
> Right now, ip/r12 for AArch32 and x7 for AArch64 is used to indicate
> whether or not the stop has been signalled from syscall entry or syscall
> exit. This means that:
>
> - Any writes by the tracer to this register during the stop are
>   ignored/discarded.
>
> - The actual value of the register is not available during the stop,
>   so the tracer cannot save it and restore it later.
>
> This series introduces NT_ARM_PRSTATUS to get all registers and makes
> it possible to change ip/r12 and x7 registers when tracee is stopped in
> syscall traps.
>
> For applications like the user-mode Linux or gVisor, it is critical to
> have access to the full set of registers at any moment. For example,
> they need to change values of all registers to emulate rt_sigreturn and
> they need to have the full set of registers to build a signal frame.
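To make the clobbering concrete, here is a minimal sketch of how an AArch64 tracer might interpret x7 at a syscall stop today. It assumes the current arm64 behaviour in which the kernel writes 0 to x7 at a syscall-entry stop and 1 at a syscall-exit stop; struct gpr_view and classify_syscall_stop() are hypothetical names used only for illustration, not kernel or libc API.

```c
#include <stdint.h>

/*
 * Hypothetical local mirror of the AArch64 general-purpose register
 * set (x0..x30, sp, pc, pstate). It is defined here so that the sketch
 * builds on any architecture; a real tracer would use the definition
 * from the system headers.
 */
struct gpr_view {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

enum stop_dir {
	STOP_UNKNOWN = -1,
	STOP_ENTRY = 0,
	STOP_EXIT = 1,
};

/*
 * Assumption: during a syscall stop the kernel has replaced the
 * tracee's x7 with 0 (entry) or 1 (exit). The tracee's real x7 is
 * therefore not visible in the register view passed in here.
 */
static enum stop_dir classify_syscall_stop(const struct gpr_view *r)
{
	switch (r->regs[7]) {
	case 0:
		return STOP_ENTRY;
	case 1:
		return STOP_EXIT;
	default:
		return STOP_UNKNOWN;
	}
}
```

Whatever the tracee had in x7 before the stop is not part of what this function sees, which is exactly the limitation described above.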
I have found the thread [1] where Keno, Will, and Dave discussed the same
problem. If I understand this right, the problem was not fixed because
there were no users who needed it.

gVisor is a general-purpose sandbox for running untrusted workloads. It has
a platform interface that is responsible for syscall interception, context
switching, and managing process address spaces. Right now, we have kvm and
ptrace platforms.

The ptrace platform runs guest code in the context of stub processes and
intercepts syscalls with the help of PTRACE_SYSEMU. All system calls,
including rt_sigreturn and execve, are handled by the gVisor kernel.
Signal handling happens inside the gVisor kernel too. Each stub process can
have more than one thread, but we don't bind guest threads to stub threads,
and we can run more than one guest thread in the context of one stub thread.

Taking all these facts into account, we need access to all registers
whenever a stub thread is stopped.

We were able to introduce a workaround [3] for this issue. Each time a stub
process is stopped on a system call, we queue a fake signal and resume the
process so that it stops on the signal. It works, but it requires an extra
interaction with the stub process, which is expensive. My benchmarks show
that this workaround slows down syscalls in gVisor by more than 50%.

BTW: this is one of the major reasons why PTRACE_SYSEMU was introduced
instead of emulating it via two calls of PTRACE_SYSCALL.

[1] https://lore.kernel.org/lkml/CABV8kRz0mKSc=u1LeonQSLroKJLOKWOWktCoGji2nvEBc=e7=w@mail.gmail.com/#r
[2] https://github.com/google/gvisor/issues/5238
[3] https://github.com/google/gvisor/commit/a44efaab6d4b815880749a39647fb3ed9634a489
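For reference, this is roughly how a tracer fetches a register set from a stopped thread with PTRACE_GETREGSET. It is a minimal sketch, not gVisor's code: read_regset() is a hypothetical helper, and the note type is a parameter because the value of NT_ARM_PRSTATUS comes from the patched include/uapi/linux/elf.h and is deliberately not hard-coded here.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/ptrace.h>

/*
 * Fetch one register set from a stopped tracee.
 *
 * note_type selects the register set: NT_PRSTATUS from <elf.h> today,
 * or (assumption) the NT_ARM_PRSTATUS constant from a kernel with this
 * series applied.
 *
 * Returns the number of bytes the kernel wrote into buf, or -1 with
 * errno set by ptrace().
 */
static long read_regset(pid_t pid, unsigned long note_type,
			void *buf, size_t len)
{
	struct iovec iov = {
		.iov_base = buf,
		.iov_len = len,
	};

	if (ptrace(PTRACE_GETREGSET, pid, (void *)(uintptr_t)note_type, &iov) == -1)
		return -1;

	/* The kernel updates iov_len to the size it actually returned. */
	return (long)iov.iov_len;
}
```

With the workaround [3], a call like this only returns the real registers after the extra signal stop; with the series applied, the intent is that one call at the syscall stop would be enough.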
Andrei Vagin (3):
  arm64/ptrace: don't clobber task registers on syscall entry/exit traps
  arm64/ptrace: introduce NT_ARM_PRSTATUS to get a full set of registers
  selftest/arm64/ptrace: add a test for NT_ARM_PRSTATUS

 arch/arm64/include/asm/ptrace.h               |   5 +
 arch/arm64/kernel/ptrace.c                    | 130 +++++++++++-----
 include/uapi/linux/elf.h                      |   1 +
 tools/testing/selftests/arm64/Makefile        |   2 +-
 tools/testing/selftests/arm64/ptrace/Makefile |   6 +
 .../arm64/ptrace/ptrace_syscall_regs_test.c   | 142 ++++++++++++++++++
 6 files changed, 246 insertions(+), 40 deletions(-)
 create mode 100644 tools/testing/selftests/arm64/ptrace/Makefile
 create mode 100644 tools/testing/selftests/arm64/ptrace/ptrace_syscall_regs_test.c

--
2.29.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel