On 2025-08-26 07:03:33 [-0700], Sean Christopherson wrote:
> And the call from __vhost_worker_flush() is done while holding a vhost_worker.mutex.
> That's probably ok? But there are many paths that lead to __vhost_worker_flush(),
> which makes it difficult to audit all flows. So even if there is an easy change
> for the RCU conflict, I wouldn't be comfortable adding a mutex_lock() to so many
> flows in a patch that needs to go to stable@.
If I may throw something else into the mix: if you take an "early"
reference with get_task_struct() on the thread (from within the thread
itself), then you could still wake it after its do_exit(), since the
task_struct would remain valid. Once you have removed it from all
structures where it can be found, you would do the final
put_task_struct().
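
To illustrate the lifetime rule I have in mind, here is a minimal
userspace sketch. It is not kernel code and not a proposed patch: struct
task, struct worker and every function below are hypothetical stand-ins
(get_task()/put_task() model get_task_struct()/put_task_struct(),
task_exit() models do_exit()), and it is single-threaded on purpose so
that only the reference counting is shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's task_struct. */
struct task {
	atomic_int usage;	/* reference count */
	int exited;		/* nonzero once the modelled do_exit() has run */
	int wakeups;		/* wake attempts that reached a valid object */
};

/* Hypothetical stand-in for a structure through which the task is found. */
struct worker {
	struct task *task;
};

static int tasks_freed;		/* lets a caller observe the final free */

static struct task *task_create(void)
{
	struct task *t = calloc(1, sizeof(*t));

	if (t)
		atomic_init(&t->usage, 1);	/* the task's own reference */
	return t;
}

static void get_task(struct task *t)
{
	atomic_fetch_add(&t->usage, 1);
}

static void put_task(struct task *t)
{
	/* atomic_fetch_sub() returns the old value; 1 means last reference. */
	if (atomic_fetch_sub(&t->usage, 1) == 1) {
		free(t);
		tasks_freed++;
	}
}

/* Models the start of the thread function: pin ourselves, then publish. */
static void worker_thread_start(struct worker *w, struct task *self)
{
	get_task(self);		/* the "early" reference, owned by w */
	w->task = self;
}

/* Models do_exit(): the task's own reference goes away. */
static void task_exit(struct task *t)
{
	t->exited = 1;
	put_task(t);
}

/* Models a flush path waking the worker; valid while w holds its reference. */
static void worker_wake(struct worker *w)
{
	if (w->task)
		w->task->wakeups++;
}

/* Unpublish first, then drop the final reference. */
static void worker_release(struct worker *w)
{
	struct task *t = w->task;

	w->task = NULL;
	if (t)
		put_task(t);
}

/* Returns the number of wakeups delivered after exit, or -1 on failure. */
int demo(void)
{
	struct worker w = { 0 };
	struct task *t = task_create();
	int wakeups;

	if (!t)
		return -1;
	worker_thread_start(&w, t);
	task_exit(t);		/* usage 2 -> 1, object still valid */
	assert(t->exited);
	worker_wake(&w);	/* wake after exit touches valid memory */
	wakeups = w.task->wakeups;
	worker_release(&w);	/* usage 1 -> 0, object freed */
	return wakeups;
}
```

Calling demo() walks through the order described above: early get, exit,
late wake, unpublish, final put. The early reference only guarantees that
the memory stays valid; whether waking an exited task is harmless in the
vhost paths would still have to be checked there.
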
Sebastian