[PATCH v3 6/8] x86: Split syscall_trace_enter into two phases
From: luto@amacapital.net (Andy Lutomirski)
Date: 2014-07-28 20:23:38
Also in: linux-arch, linux-mips, lkml
On Mon, Jul 28, 2014 at 10:37 AM, Oleg Nesterov [off-list ref] wrote:
> Hi Andy, I am really sorry for the delay. This is on top of the recent change from Kees, right? Could you remind me where I can find the tree this series is based on? So that I could actually apply these changes...
https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git/log/?h=seccomp/fastpath

The first four patches are already applied there.
> On 07/21, Andy Lutomirski wrote:
> > +long syscall_trace_enter_phase2(struct pt_regs *regs, u32 arch,
> > +				unsigned long phase1_result)
> >  {
> >  	long ret = 0;
> > +	u32 work = ACCESS_ONCE(current_thread_info()->flags) &
> > +		_TIF_WORK_SYSCALL_ENTRY;
> > +
> > +	BUG_ON(regs != task_pt_regs(current));
> >  	user_exit();
> > @@ -1458,17 +1562,20 @@ long syscall_trace_enter(struct pt_regs *regs)
> >  	 * do_debug() and we need to set it again to restore the user
> >  	 * state. If we entered on the slow path, TF was already set.
> >  	 */
> > -	if (test_thread_flag(TIF_SINGLESTEP))
> > +	if (work & _TIF_SINGLESTEP)
> >  		regs->flags |= X86_EFLAGS_TF;
>
> This looks suspicious, but perhaps I misread this change. If I understand correctly, syscall_trace_enter() can avoid _phase2() above. But we should always call user_exit() unconditionally?
Damnit. I read every function called by user_exit, and none of them gives any indication of why it's needed for traced syscalls but not for untraced syscalls. On a second look, it seems that TIF_NOHZ controls it. I'll update the code to call user_exit iff TIF_NOHZ is set. If that's still wrong, then I don't see how the current code is correct either.
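A minimal sketch of the conditional described above, written as a userspace toy so that it compiles on its own. The flag values and all toy_* names are hypothetical stand-ins, not the kernel's definitions, and this is not the code from the eventual patch; it only illustrates "call user_exit iff TIF_NOHZ is set" against a one-time snapshot of the flags.

```c
#include <stdint.h>

/* Made-up bit values for this toy model; the real TIF_* bits are
 * defined by the kernel and differ. */
#define TOY_TIF_NOHZ        (1u << 0)
#define TOY_TIF_SINGLESTEP  (1u << 1)

/* Counts how often the user_exit stand-in ran. */
static int toy_user_exit_calls;

/* Stand-in for user_exit(); the real function does context tracking. */
static void toy_user_exit(void)
{
	toy_user_exit_calls++;
}

/* Run the stand-in if and only if the NOHZ bit is set in the snapshot. */
static void toy_entry_work(uint32_t work)
{
	if (work & TOY_TIF_NOHZ)
		toy_user_exit();
}
```

With this shape, a snapshot containing only the single-step bit skips the call, while any snapshot with the NOHZ bit makes it exactly once.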
> And we should always set X86_EFLAGS_TF if TIF_SINGLESTEP? IIRC, TF can actually be cleared on a 32-bit kernel if we step over the sysenter insn?
I don't follow. If TIF_SINGLESTEP is set, then phase1 will return a nonzero value, and phase2 will set TF. I admit I don't really understand all the TF machinations.

--Andy
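The phase1/phase2 argument above can be modeled roughly as follows. This is a userspace toy under stated assumptions: the toy_* names, the struct, and the work-bit values are hypothetical, and the real phase 1 in the series decides on more than two bits. Only the TF value matches x86, where TF is bit 8 of EFLAGS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up work bits for this toy; only the shape of the logic matters. */
#define TOY_TIF_SINGLESTEP  (1u << 0)
#define TOY_TIF_OTHER_WORK  (1u << 1)
#define TOY_EFLAGS_TF       (1ul << 8)	/* TF is bit 8 of EFLAGS on x86 */

/* Hypothetical stand-in for struct pt_regs. */
struct toy_regs {
	unsigned long flags;
};

/* Phase 1: return nonzero when the slow path (phase 2) is required. */
static unsigned long toy_phase1(uint32_t work)
{
	return (work & (TOY_TIF_SINGLESTEP | TOY_TIF_OTHER_WORK)) ? 1 : 0;
}

/* Phase 2: set TF again when single-stepping is requested. */
static void toy_phase2(struct toy_regs *regs, uint32_t work)
{
	assert(regs != NULL);
	if (work & TOY_TIF_SINGLESTEP)
		regs->flags |= TOY_EFLAGS_TF;
}

/* Caller: enter phase 2 only when phase 1 asks for it. */
static void toy_syscall_trace_enter(struct toy_regs *regs, uint32_t work)
{
	if (toy_phase1(work))
		toy_phase2(regs, work);
}
```

In this model, skipping phase 2 never loses the TF update, because any snapshot with the single-step bit makes phase 1 return nonzero. Whether the same holds when TF was cleared by hardware before entry is the open question raised in the thread and is not modeled here.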