Thread (14 messages), 4 authors, 2020-09-27

Re: [PATCH v2 0/4] [RFC] Implement Trampoline File Descriptor

From: Madhavan T. Venkataraman <hidden>
Date: 2020-09-23 23:51:50
Also in: linux-api, linux-arm-kernel, linux-fsdevel, linux-integrity, lkml


On 9/23/20 2:51 PM, Arvind Sankar wrote:
On Wed, Sep 23, 2020 at 02:17:30PM -0500, Madhavan T. Venkataraman wrote:

On 9/23/20 4:11 AM, Arvind Sankar wrote:
For libffi, I think the proposed standard trampoline won't actually
work, because not all ABIs have two scratch registers available to use
as code_reg and data_reg. E.g., i386 fastcall only has one, and the
FFI_REGISTER ABI has zero scratch registers. I believe 32-bit ARM only
has one scratch register as well.
The trampoline is invoked as a function call in the libffi case. Any
caller saved register can be used as code_reg, can it not? And the
scratch register is needed only to jump to the code. After that, it
can be reused for any other purpose.

However, for ARM, you are quite correct. There is only one scratch
register. This means that I have to provide two types of trampolines:

	- If an architecture has enough scratch registers, use the currently
	  defined trampoline.

	- If the architecture has only one scratch register, but has PC-relative
	  data references, then embed the code address at the bottom of the
	  trampoline and access it using PC-relative addressing.
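The second trampoline type could be laid out as in the sketch below. This is only an illustration, not the layout from the patch: it assumes 32-bit ARM in A32 state, little-endian, with r12 as the single scratch register holding the data address, and the helper names (build_trampoline, literal_target) are hypothetical. Both addresses sit at the bottom of the trampoline and are reached with PC-relative loads, so no second scratch register is needed for the jump.

```python
import struct

# A32 encodings of two PC-relative loads (LDR literal, offset 0).
LDR_R12_PC_0 = 0xE59FC000  # ldr r12, [pc, #0]  -> data address into r12
LDR_PC_PC_0 = 0xE59FF000   # ldr pc,  [pc, #0]  -> jump to the code address


def build_trampoline(data_addr, code_addr):
    """Return a 16-byte trampoline: two instructions, then two literals.

    Assumed layout (hypothetical, little-endian):
      +0   ldr r12, [pc, #0]
      +4   ldr pc,  [pc, #0]
      +8   .word data_addr
      +12  .word code_addr
    """
    return struct.pack("<4I", LDR_R12_PC_0, LDR_PC_PC_0, data_addr, code_addr)


def literal_target(insn_offset, imm=0):
    """Offset read by a PC-relative load located at insn_offset.

    In A32 state, reading PC yields the instruction's address plus 8.
    """
    return insn_offset + 8 + imm


blob = build_trampoline(0x20001000, 0x00012340)
code_word = struct.unpack_from("<I", blob, literal_target(4))[0]
print(hex(code_word))
```

The load at offset 4 reads the word at offset 12, which is the embedded code address, so the jump consumes no general-purpose register at all.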

Thanks for pointing this out.

Madhavan
libffi is trying to provide closures with non-standard ABIs as well: the
actual user function is standard ABI, but the closure can be called with
a different ABI. If the closure was created with the FFI_REGISTER ABI, there
are no registers available for the trampoline to use: EAX, EDX and ECX
contain the first three arguments of the function, and every other
register is callee-save.

I provided a sample of the kind of trampoline that would be needed in
this case -- it's position-independent and doesn't clobber any registers
at all, and you get 255 trampolines per page. If I take another 16-byte
slot out of the page for the end trampoline that does the actual work,
I'm sure I could even come up with one that can just call a normal C
function, only the return might need special handling depending on the
return type.
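The slot arithmetic behind those numbers can be sketched as follows. The 4096-byte page size and the reading that each trampoline occupies one 16-byte slot are assumptions inferred from "another 16-byte slot"; the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
SLOT_SIZE = 16    # from "another 16-byte slot" in the message


def trampolines_per_page(reserved_slots, page_size=PAGE_SIZE, slot_size=SLOT_SIZE):
    """Number of slots left for trampolines after reserving some slots."""
    return page_size // slot_size - reserved_slots


def slot_offset(index, slot_size=SLOT_SIZE):
    """Byte offset of the trampoline with the given index within its page."""
    return index * slot_size


# One reserved slot gives the 255 trampolines mentioned above;
# reserving a second slot for the end trampoline leaves 254.
print(trampolines_per_page(1), trampolines_per_page(2))
```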

And again, do you actually have any example of an architecture that
cannot run position-independent code? PC-relative addressing is an
implementation detail: the fact that it's available for x86_64 but not
for i386 just makes position-independent code more cumbersome on i386,
but it doesn't make it impossible. For the tiny trampolines here, it
makes almost no difference.
Hi Arvind,

I am preparing a response for all of your comments. I will send it out
tomorrow. Sorry for the delay.

Madhavan