Re: [PATCH v2 4/4] freezer,sched: Rewrite core freezer logic
From: Oleg Nesterov <oleg@redhat.com>
Date: 2021-07-07 14:14:26
Sorry for the delay... I am still trying to understand this series, just one note for now.

On 06/24, Peter Zijlstra wrote:
+static bool __freeze_task(struct task_struct *p)
+{
+	unsigned long flags;
+	unsigned int state;
+	bool frozen = false;
+
+	raw_spin_lock_irqsave(&p->pi_lock, flags);
+	state = READ_ONCE(p->__state);
+	if (state & (TASK_FREEZABLE|__TASK_STOPPED|__TASK_TRACED)) {
+		/*
+		 * Only TASK_NORMAL can be augmented with TASK_FREEZABLE,
+		 * since they can suffer spurious wakeups.
+		 */
+		if (state & TASK_FREEZABLE)
+			WARN_ON_ONCE(!(state & TASK_NORMAL));
+
+#ifdef CONFIG_LOCKDEP
+		/*
+		 * It's dangerous to freeze with locks held; there be dragons there.
+		 */
+		if (!(state & __TASK_FREEZABLE_UNSAFE))
+			WARN_ON_ONCE(debug_locks && p->lockdep_depth);
+#endif
+
+		if (state & (__TASK_STOPPED|__TASK_TRACED))
+			WRITE_ONCE(p->__state, TASK_FROZEN|__TASK_FROZEN_SPECIAL);

Well, this doesn't look right.

Firstly, this can race with ptrace_freeze_traced(), which can set
p->__state = __TASK_TRACED and clear TASK_FROZEN. Or with
__set_current_state(TASK_RUNNING) in ptrace_stop().

But the main problem is that you can't simply remove __TASK_TRACED. This
can confuse the debugger: any ptrace() request will fail as if the tracee
was killed.

Another problem. Suppose that p->parent sleeps in do_wait(). p calls
ptrace_stop(), sets __TASK_TRACED, and wakes the parent up.
__freeze_task() clears __TASK_TRACED. The parent calls
wait_task_stopped(p), but it fails because task_is_traced() returns false.
The parent sleeps again, and forever, because __thaw_special() won't
notify it.

Or. Suppose that __freeze_task() removes __TASK_STOPPED. The new debugger
comes, the tracee should switch from STOPPED to TRACED. But this won't
happen because task_is_stopped() in ptrace_() will return false and
task_set_jobctl_pending/signal_wake_up_state won't be called.

Oleg.
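
For illustration only, the do_wait() scenario above can be modelled in plain userspace C. This is a rough sketch, not kernel code: every model_* name is made up for the example, the MODEL_TASK_FROZEN and MODEL_TASK_FROZEN_SPECIAL bit values are assumptions, and locking and wakeups are not modelled at all. It only shows that once the state word is overwritten, a check in the style of task_is_traced() no longer sees the trace stop.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the two FROZEN values are assumptions. */
#define MODEL_TASK_STOPPED         0x0004u
#define MODEL_TASK_TRACED          0x0008u
#define MODEL_TASK_FROZEN          0x8000u  /* hypothetical */
#define MODEL_TASK_FROZEN_SPECIAL  0x10000u /* hypothetical */

/* Hypothetical stand-in for the one field of task_struct we care about. */
struct model_task {
	unsigned int state;
};

/* Same shape as a task_is_traced() style test: is the TRACED bit set? */
static bool model_task_is_traced(const struct model_task *p)
{
	return (p->state & MODEL_TASK_TRACED) != 0;
}

/*
 * Models the quoted hunk: for a stopped or traced task the whole state
 * word is overwritten, so the STOPPED/TRACED bit is lost.
 */
static void model_freeze_overwrite(struct model_task *p)
{
	if (p->state & (MODEL_TASK_STOPPED | MODEL_TASK_TRACED))
		p->state = MODEL_TASK_FROZEN | MODEL_TASK_FROZEN_SPECIAL;
}

/*
 * Walks through the sequence: the tracee stops, the freezer runs, and
 * only then does the parent get to check. Returns true if the parent
 * would miss the stop.
 */
static bool model_parent_misses_stop(void)
{
	struct model_task p = { .state = MODEL_TASK_TRACED };

	/* Before freezing, the parent's check would succeed. */
	assert(model_task_is_traced(&p));

	model_freeze_overwrite(&p);

	/* After freezing, the same check fails. */
	return !model_task_is_traced(&p);
}
```

In this model model_parent_misses_stop() returns true, which corresponds to the parent going back to sleep in do_wait() with nothing left to wake it.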