Thread (87 messages) 87 messages, 12 authors, 2022-01-17

Re: [RFC PATCH 11/13] x86/uintr: Introduce uintr_wait() syscall

From: Thomas Gleixner <hidden>
Date: 2021-10-01 21:29:16
Also in: linux-api, linux-kselftest, lkml

On Fri, Oct 01 2021 at 08:13, Andy Lutomirski wrote:
On Fri, Oct 1, 2021, at 2:56 AM, Thomas Gleixner wrote:
quoted
On Thu, Sep 30 2021 at 21:41, Andy Lutomirski wrote:
quoted
On Thu, Sep 30, 2021, at 5:01 PM, Thomas Gleixner wrote:
quoted
Now that I read the docs some more, I'm seriously concerned about this
XSAVE design.  XSAVES with UINTR is destructive -- it clears UINV.  If
we actually use this, then the whole last_cpu "preserve the state in
registers" optimization goes out the window.  So does anything that
happens to assume that merely saving the state doesn't destroy it on
respectable modern CPUs XRSTORS will #GP if you XRSTORS twice, which
makes me nervous and would need a serious audit of our XRSTORS paths.
I have no idea what you are fantasizing about. You can XRSTORS five
times in a row as long as your XSTATE memory image is correct.
I'm just reading TFM, which is some kind of dystopian fantasy.

11.8.2.4 XRSTORS

Before restoring the user-interrupt state component, XRSTORS verifies
that UINV is 0. If it is not, XRSTORS causes a general-protection
fault (#GP) before loading any part of the user-interrupt state
component. (UINV is IA32_UINTR_MISC[39:32]; XRSTORS does not check the
contents of the remainder of that MSR.)
Duh. I was staring at the SDM and searching for a hint. Stupid me!
So if UINV is set in the memory image and you XRSTORS five times in a
row, the first one will work assuming UINV was zero.  The second one
will #GP.
Yes. I can see what you mean now :)
11.8.2.3 XSAVES
After saving the user-interrupt state component, XSAVES clears UINV. (UINV is IA32_UINTR_MISC[39:32];
XSAVES does not modify the remainder of that MSR.)

So if we're running a UPID-enabled user task and we switch to a kernel
thread, we do XSAVES and UINV is cleared.  Then we switch back to the
same task and don't do XRSTORS (or otherwise write IA32_UINTR_MISC)
and UINV is still clear.
Yes, that has to be mopped up on the way to user space.
And we had better clear UINV when running a kernel thread because the
UPID might get freed or the kernel thread might do some CPL3
shenanigans (via EFI, perhaps? I don't know if any firmwares actually
do this).
Right. That's what happens already with the current pile.
So all this seems to put UINV into the "independent" category of
feature along with LBR.  And the 512-byte wastes from extra copies of
the legacy area and the loss of the XMODIFIED optimization will just
be collateral damage.
So we'd end up with two XSAVES on context switch. We can simply do:

        XSAVES();
        fpu.state.xtsate.uintr.uinv = 0;

which allows to do as many XRSTORS in a row as we want. Only the final
one on the way to user space will have to restore the real vector if the
register state is not valid:

       if (fpu_state_valid()) {
            if (needs_uinv(current)
               wrmsrl(UINV, vector);
       } else {
            if (needs_uinv(current)
               fpu.state.xtsate.uintr.uinv = vector;
            XRSTORS();
       }

Hmm?

Thanks,

        tglx
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help