Thread (18 messages) 18 messages, 5 authors, 2025-09-02

Re: [PATCH] tracing: Fix tracing_marker may trigger page fault during preempt_disable

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2025-08-29 12:25:45
Also in: linux-arm-kernel, lkml

[ Adding arm64 maintainers ]

On Fri, 29 Aug 2025 16:29:07 +0800
Luo Gengkun [off-list ref] wrote:
On 2025/8/20 1:50, Steven Rostedt wrote:
quoted
On Tue, 19 Aug 2025 10:51:52 +0000
Luo Gengkun [off-list ref] wrote:
 
quoted
Both tracing_mark_write and tracing_mark_raw_write call
__copy_from_user_inatomic during preempt_disable. But in some case,
__copy_from_user_inatomic may trigger page fault, and will call schedule()
subtly. And if a task is migrated to other cpu, the following warning will  
Wait! What?

__copy_from_user_inatomic() is allowed to be called from in atomic context.
Hence the name it has. How the hell can it sleep? If it does, it's totally
broken!

Now, I'm not against using nofault() as it is better named, but I want to
know why you are suggesting this change. Did you actually trigger a bug here?  
yes, I trigger this bug in arm64.
And I still think this is an arm64 bug.
quoted
 
quoted
be trigger:
         if (RB_WARN_ON(cpu_buffer,
                        !local_read(&cpu_buffer->committing)))

An example can illustrate this issue:

process flow						CPU
---------------------------------------------------------------------

tracing_mark_raw_write():				cpu:0
    ...
    ring_buffer_lock_reserve():				cpu:0
       ...
       cpu = raw_smp_processor_id()			cpu:0
       cpu_buffer = buffer->buffers[cpu]			cpu:0
       ...
    ...
    __copy_from_user_inatomic():				cpu:0
       ...
       # page fault
       do_mem_abort():					cpu:0  
Sounds to me that arm64 __copy_from_user_inatomic() may be broken.
 
quoted
          ...
          # Call schedule
          schedule()					cpu:0
	 ...
    # the task schedule to cpu1
    __buffer_unlock_commit():				cpu:1
       ...
       ring_buffer_unlock_commit():			cpu:1
	 ...
	 cpu = raw_smp_processor_id()			cpu:1
	 cpu_buffer = buffer->buffers[cpu]		cpu:1

As shown above, the process will acquire cpuid twice and the return values
are not the same.

To fix this problem using copy_from_user_nofault instead of
__copy_from_user_inatomic, as the former performs 'access_ok' before
copying.

Fixes: 656c7f0d2d2b ("tracing: Replace kmap with copy_from_user() in trace_marker writing")  
The above commit was intorduced in 2016. copy_from_user_nofault() was
introduced in 2020. I don't think this would be the fix for that kernel.

So no, I'm not taking this patch. If you see __copy_from_user_inatomic()
sleeping, it's users are not the issue. That function is.

-- Steve

 
I noticed that in most places where __copy_from_user_inatomic() is used,
"most" but not all?
it is within the pagefault_disable/enable() section. When pagefault_disable()
is called, user access methods will no sleep. So I'm going to send a v2patch which use pagefault_disable/enable()to fix this problem. -- Gengkun
No, I don't want that either. __copy_from_user_inatomic() SHOULD NOT SLEEP!
If it does, than it is a bug!

If it can sleep, "inatomic" is a very bad name. The point of being
"inatomic" is that you are in a location that IS NOT ALLOWED TO SLEEP!

I don't want to fix a symptom and leave a bug around.

BTW, the reason not to fault is because this might be called in code that is
already doing a fault and could cause deadlocks. The no sleeping part is a
side effect.

-- Steve
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help