Re: [PATCH v2 0/4] [RFC] Implement Trampoline File Descriptor
From: Arvind Sankar <hidden>
Date: 2020-09-23 09:11:34
Also in:
linux-api, linux-fsdevel, linux-integrity, linux-security-module, lkml
On Tue, Sep 22, 2020 at 09:46:16PM -0400, Arvind Sankar wrote:
> On Thu, Sep 17, 2020 at 10:36:02AM -0500, Madhavan T. Venkataraman wrote:
> > On 9/16/20 8:04 PM, Florian Weimer wrote:
> > > * madvenka:
> > > > Examples of trampolines
> > > > =======================
> > > >
> > > > libffi (A Portable Foreign Function Interface Library):
> > > >
> > > > libffi allows a user to define functions with an arbitrary list
> > > > of arguments and return value through a feature called
> > > > "Closures". Closures use trampolines to jump to ABI handlers
> > > > that handle calling conventions and call a target function.
> > > > libffi is used by a lot of different applications. To name a
> > > > few:
> > > >
> > > >     - Python
> > > >     - Java
> > > >     - Javascript
> > > >     - Ruby FFI
> > > >     - Lisp
> > > >     - Objective C
> > >
> > > libffi does not actually need this. It currently collocates
> > > trampolines and the data they need on the same page, but that's
> > > actually unnecessary. It's possible to avoid doing this just by
> > > changing libffi, without any kernel changes.
> > >
> > > I think this has already been done for the iOS port.
> >
> > The trampoline table that has been implemented for the iOS port
> > (MACH) is based on PC-relative data referencing. That is, the code
> > and data are placed in adjacent pages so that the code can access
> > the data using an address relative to the current PC.
> >
> > This is an ISA feature that is not supported on all architectures.
> >
> > Now, if it is a performance feature, we can include some
> > architectures and exclude others. But this is a security feature.
> > IMO, we cannot exclude any architecture even if it is a legacy one
> > as long as Linux is running on the architecture. So, we need a
> > solution that does not assume any specific ISA feature.
>
> Which ISA does not support PIC objects? You mentioned i386 below, but
> i386 does support them, it just needs to copy the PC into a GPR first
> (see below).
> > > > The code for trampoline X in the trampoline table is:
> > > >
> > > >     load    &code_table[X], code_reg
> > > >     load    (code_reg), code_reg
> > > >     load    &data_table[X], data_reg
> > > >     load    (data_reg), data_reg
> > > >     jump    code_reg
> > > >
> > > > The addresses &code_table[X] and &data_table[X] are baked into
> > > > the trampoline code. So, PC-relative data references are not
> > > > needed. The user can modify code_table[X] and data_table[X]
> > > > dynamically.
> > >
> > > You can put this code into the libffi shared object and map it
> > > from there, just like the rest of the libffi code. To get more
> > > trampolines, you can map the page containing the trampolines
> > > multiple times, each instance preceded by a separate data page
> > > with the control information.
> >
> > If you put the code in the libffi shared object, how do you pass
> > data to the code at runtime? If the code we are talking about is a
> > function, then there is an ABI defined way to pass data to the
> > function. But if the code we are talking about is some arbitrary
> > code such as a trampoline, there is no ABI defined way to pass data
> > to it except in a couple of platforms such as HP PA-RISC that have
> > support for function descriptors in the ABI itself.
> >
> > As mentioned before, if the ISA supports PC-relative data
> > references (e.g., X86 64-bit platforms support RIP-relative data
> > references) then we can pass data to that code by placing the code
> > and data in adjacent pages. So, you can implement the trampoline
> > table for X64. i386 does not support it.
>
> i386 just needs a tiny bit of code to copy the PC into a GPR first,
> i.e. the trampoline would be:
>
>         call    1f
>     1:  pop     %data_reg
>         movl    (code_table + X - 1b)(%data_reg), %code_reg
>         movl    (data_table + X - 1b)(%data_reg), %data_reg
>         jmp     *(%code_reg)
>
> I do not understand the point about passing data at runtime. This
> trampoline is to achieve exactly that, no?
>
> Thanks.
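To illustrate the quoted suggestion of mapping the trampoline page
multiple times, each instance preceded by its own data page, here is a
minimal userspace sketch in C. It is not libffi code: map_instance()
and demo_two_instances() are hypothetical names, a temporary file
stands in for the page that would really come from the libffi shared
object, and the "code" page is mapped PROT_READ rather than
PROT_READ | PROT_EXEC because the stand-in holds no real instructions.

```c
#define _DEFAULT_SOURCE 1       /* MAP_ANONYMOUS, mkstemp on glibc */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve two adjacent pages, make the first a
 * private read/write data page and map page 0 of `fd` over the second.
 * Returns the address of the data page (code page is at +page), or
 * NULL on failure.
 */
static char *map_instance(int fd, size_t page)
{
    char *base = mmap(NULL, 2 * page, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Data page first, then the shared, file-backed "code" page. */
    if (mprotect(base, page, PROT_READ | PROT_WRITE) != 0 ||
        mmap(base + page, page, PROT_READ, MAP_SHARED | MAP_FIXED,
             fd, 0) == MAP_FAILED) {
        munmap(base, 2 * page);
        return NULL;
    }
    return base;
}

/* Returns 0 if two instances share code but have distinct data. */
int demo_two_instances(void)
{
    static const char marker[] = "stand-in for trampoline code";
    char path[] = "/tmp/tramp-demo-XXXXXX";
    long psz = sysconf(_SC_PAGESIZE);
    char *a = NULL, *b = NULL;
    int fd, ok = 0;

    if (psz <= 0)
        return -1;
    size_t page = (size_t)psz;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* file lives on through fd only */

    if (ftruncate(fd, (off_t)page) == 0 &&
        write(fd, marker, sizeof marker) == (ssize_t)sizeof marker) {
        a = map_instance(fd, page);
        b = map_instance(fd, page);
    }
    close(fd);

    if (a && b) {
        a[0] = 1;               /* per-instance control data */
        b[0] = 2;
        ok = memcmp(a + page, marker, sizeof marker) == 0 &&
             memcmp(a + page, b + page, page) == 0 &&
             a[0] == 1 && b[0] == 2;
    }
    if (a)
        munmap(a, 2 * page);
    if (b)
        munmap(b, 2 * page);
    return ok ? 0 : -1;
}
```

Calling demo_two_instances() should return 0 when both mappings show
the same file contents while each keeps its own writable data page. In
the sketch the code finds its data at a fixed negative offset from its
own address, which is why the trampoline still needs some way to learn
its PC.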
For libffi, I think the proposed standard trampoline won't actually
work, because not all ABIs have two scratch registers available to use
as code_reg and data_reg. Eg i386 fastcall only has one, and register
has zero scratch registers. I believe 32-bit ARM only has one scratch
register as well.

For i386 you'd need something that saves a register on the stack first,
maybe like the below with a 16-byte trampoline and a 16-byte context
structure that has the address of the code to jump to in the first
dword:

        .balign 4096
trampoline_page:
        .rept 4096/16-1
0:      endbr32
        push %eax
        call __x86.get_pc_thunk.ax
1:      jmp trampoline
        .balign 16
        .endr

        .org trampoline_page + 4096 - 16
__x86.get_pc_thunk.ax:
        movl (%esp), %eax
        ret
trampoline:
        subl $(1b-0b), %eax
        jmp *(table-trampoline_page)(%eax)

        .org trampoline_page + 4096
table:
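The address arithmetic behind that layout can be sketched in C. This
only models the addresses and is not part of any proposed
implementation: the names below are made up for illustration, and
RET_OFFSET assumes the usual i386 encodings (endbr32 is 4 bytes,
push %eax is 1 byte, a near call is 5 bytes), which is what 1b-0b
would evaluate to under that assumption.

```c
#include <assert.h>
#include <stdint.h>

enum {
    TRAMP_PAGE_SIZE = 4096,
    SLOT_SIZE       = 16,
    /* The last 16 bytes hold the PC thunk and the shared tail. */
    NUM_SLOTS       = TRAMP_PAGE_SIZE / SLOT_SIZE - 1,
    /* Assumed value of 1b-0b: endbr32 (4) + push (1) + call (5). */
    RET_OFFSET      = 4 + 1 + 5,
};

/* Address of trampoline slot i in the page starting at `page`. */
static uintptr_t slot_addr(uintptr_t page, unsigned i)
{
    assert(i < NUM_SLOTS);
    return page + (uintptr_t)i * SLOT_SIZE;
}

/*
 * What the shared tail computes: the thunk leaves the return address
 * (label 1) in %eax, and subtracting 1b-0b yields the slot start.
 */
static uintptr_t slot_from_return_addr(uintptr_t ret)
{
    return ret - RET_OFFSET;
}

/*
 * table - trampoline_page is one page, so the 16-byte context of a
 * slot sits exactly 4096 bytes above it. Its first dword holds the
 * address the tail jumps to.
 */
static uintptr_t context_addr(uintptr_t slot)
{
    return slot + TRAMP_PAGE_SIZE;
}
```

For a page at 0x10000, slot 3 starts at 0x10030, the call in that slot
returns to 0x1003a, and the matching context is at 0x11030. The saved
%eax is still on the stack at that point, so whatever the context
points to has to pop it.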