Thread (26 messages), 7 authors, 2017-11-14

Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration

From: Mathieu Desnoyers <hidden>
Date: 2017-11-14 16:48:42
Also in: lkml

----- On Nov 14, 2017, at 11:08 AM, Peter Zijlstra peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org wrote:
> On Tue, Nov 14, 2017 at 05:05:41PM +0100, Peter Zijlstra wrote:
>> On Tue, Nov 14, 2017 at 03:17:12PM +0000, Mathieu Desnoyers wrote:
>>> I've tried to create a small single-threaded self-modifying loop in
>>> user-space to trigger a trace cache or speculative execution quirk,
>>> but I have not succeeded yet. I suspect that I would need to know
>>> more about the internals of the processor architecture to create the
>>> right stalls that would allow speculative execution to move further
>>> ahead, and trigger an incoherent execution flow. Ideas on how to
>>> trigger this would be welcome.
>>
>> I thought the whole problem was per definition multi-threaded.
>>
>> Single-threaded stuff can't get out of sync with itself; you'll always
>> observe your own stores.
>
> And even if you could, you can always execute a local serializing
> instruction like CPUID to force things.

What I'm trying to reproduce is something that breaks in single-threaded
case if I explicitly leave out the CPUID core serializing instruction
when doing code modification on upcoming code, in a loop.

AFAIU, Intel requires a core serializing instruction to be issued even
in single-threaded scenarios between code update and execution, to ensure
that speculative execution does not observe incoherent code. Now the
question we all have for Intel is: is this requirement too strong, or
required by reality?

Thanks,

Mathieu
>> And ISTR the JIT scenario being something like the JIT overwriting
>> previously executed but supposedly no longer used code. And in this
>> scenario you'd want to guarantee all CPUs observe the new code before
>> jumping into it.
>>
>> The current approach is using mprotect(), except that on a number of
>> platforms the TLB invalidate from that is not guaranteed to be strong
>> enough to sync for code changes.
>>
>> On x86 the mprotect() should work just fine, since we broadcast IPIs for
>> the TLB invalidate and the IRET from those will get the things synced up
>> again (if nothing else; very likely we'll have done a MOV-CR3 which will
>> of course also have sufficient syncness on it).
>>
>> But PowerPC, s390, ARM et al that do TLB invalidates without interrupts
>> and don't guarantee their TLB invalidate sync against execution units
>> are left broken by this scheme.
-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com