Thread (73 messages) 73 messages, 8 authors, 2020-07-27

Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions

From: Ira Weiny <ira.weiny@intel.com>
Date: 2020-07-24 19:44:01
Also in: linux-fsdevel, linux-kselftest, linux-mm, lkml, nvdimm

On Fri, Jul 24, 2020 at 10:29:23AM -0700, Andy Lutomirski wrote:
quoted
On Jul 24, 2020, at 10:23 AM, Ira Weiny [off-list ref] wrote:

On Thu, Jul 23, 2020 at 10:15:17PM +0200, Thomas Gleixner wrote:
quoted
Thomas Gleixner [off-list ref] writes:
quoted
Ira Weiny [off-list ref] writes:
quoted
On Fri, Jul 17, 2020 at 12:06:10PM +0200, Peter Zijlstra wrote:
quoted
quoted
On Fri, Jul 17, 2020 at 12:20:56AM -0700, ira.weiny@intel.com wrote:
I've been really digging into this today and I'm very concerned that I'm
completely missing something WRT idtentry_enter() and idtentry_exit().

I've instrumented idt_{save,restore}_pkrs(), and __dev_access_{en,dis}able()
with trace_printk()'s.

With this debug code, I have found an instance where it seems like
idtentry_enter() is called without a corresponding idtentry_exit().  This has
left the thread ref counter at 0 which results in very bad things happening
when __dev_access_disable() is called and the ref count goes negative.

Effectively this seems to be happening:

...
   // ref == 0
   dev_access_enable()  // ref += 1 ==> disable protection
       // exception  (which one I don't know)
           idtentry_enter()
               // ref = 0
               _handler() // or whatever code...
           // *_exit() not called [at least there is no trace_printk() output]...
           // Regardless of trace output, the ref is left at 0
   dev_access_disable() // ref -= 1 ==> -1 ==> does not enable protection
   (Bad stuff is bound to happen now...)
Well, if any exception which calls idtentry_enter() would return without
going through idtentry_exit() then lots of bad stuff would happen even
without your patches.
quoted
Also is there any chance that the process could be getting scheduled and that
is causing an issue?
Only from #PF, but after the fault has been resolved and the tasks is
scheduled in again then the task returns through idtentry_exit() to the
place where it took the fault. That's not guaranteed to be on the same
CPU. If schedule is not aware of the fact that the exception turned off
stuff then you surely get into trouble. So you really want to store it
in the task itself then the context switch code can actually see the
state and act accordingly.
Actually thats nasty as well as you need a stack of PKRS values to
handle nested exceptions. But it might be still the most reasonable
thing to do. 7 PKRS values plus an index should be really sufficient,
that's 32bytes total, not that bad.
I've thought about this a bit more and unless I'm wrong I think the
idtentry_state provides for that because each nested exception has it's own
idtentry_state doesn't it?
Only the ones that use idtentry_enter() instead of, say, nmi_enter().
Oh agreed...

But with this patch we are still better off than just preserving during context
switch.

I need to update the commit message here to make this clear though.

Thanks,
Ira
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help