Thread (20 messages) 20 messages, 4 authors, 2021-02-03

Re: kprobes broken since 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2021-02-02 17:00:15
Also in: lkml

Possibly related (same subject, not in this thread)

On Tue, 2 Feb 2021 17:45:47 +0100
Peter Zijlstra [off-list ref] wrote:
On Tue, Feb 02, 2021 at 09:52:49AM -0500, Steven Rostedt wrote:
quoted
But from a handler, you could do:

	if (in_nmi())
		return;
	local_irq_save(flags);
	/* Now you are safe from being re-entrant. */  
But that's an utter crap thing to do. That's like saying I don't care
about my events, at which point you might as well not bother at all.

And you can still do that, you just get less coverage today than you
used to. You used to throw things under the bus, now you throw more
under the bus. If you didn't care, I can't seem to find myself caring
either.
NMIs are special, and they always have been. They shouldn't be doing much
anyway. If they are, then that's a problem.

But if you want to make the stack tracer work on all contexts, I'm happy to
take patches. I don't have time to work on it today.
quoted
Where as there's no equivalent in a NMI handler. That's what makes
kprobe/ftrace handlers different than NMI handlers.  
I don't see how.
quoted
quoted
Also, given how everything can nest, it had better all be lockless
anyway. You can get your regular function trace interrupted, which can
hit a #DB, which can function trace, which can #BP which can function
trace again which can get #NMI etc.. Many wonderfun nestings possible.  
I would call #DB an #BP handlers very special.  
They are, just like NMI is special, which is why they're classed
together.
quoted
Question: Do #DB and #BP set "in_interrupt()"? Because the function tracer
has infrastructure to prevent recursion in the same context.  
Sure we _could_ do that, but then we get into the 'fun' problem of
getting a breakpoint/int3 at random places and calling random code and
having deadlocks because they take the same lock.
My question wasn't to have them do it, I was simply asking if they do. I
was assuming that they do not.
There was very little that stopped that from happening.
quoted
That is, a
ftrace handler calls something that gets traced, the recursion protection
will detect that and prevent the handler from being called again. But the
recursion protection is interrupt context aware and lets the handler get
called again if the recursion happens from a different context:  
quoted
If #DB and #BP do not change the in_interrupt() context, then the above
still will protect the ftrace handlers from recursion due to them.  
But it doesn't help with:

	spin_lock_irq(&foo); // task context
	#DB
	  spin_lock_irq(&foo); // interrupt context per your above
The statement above said:

 "If #DB and #BP do not change the in_interrupt() context"

Which would make the above be in the same context and the handler would
not be called for the #DB case.
All you need to do is put a breakpoint on a piece of code that holds a
spinlock and a handler that takes the same spinlock.

There was very little from stopping that.
quoted
That would require refactoring all the code that's been around since 2008.  
Because I couldn't tell why/if any of that was correct at all. #DB/#BP
don't play by the normal rules. They're _far_ more NMI-like than they're
IRQ-like due to ignoring IF.
I'm fine with #DB and #BP being a "in_nmi()", as they are probably even
more special than NMIs.

-- Steve
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help