Re: [PATCH] tracing: Fix tracing_marker may trigger page fault during preempt_disable

From: Catalin Marinas <catalin.marinas@arm.com>
Date: 2025-09-01 12:28:05
Also in: linux-arm-kernel, lkml

On Mon, Sep 01, 2025 at 10:56:47AM +0100, Mark Rutland wrote:

On Sat, Aug 30, 2025 at 11:22:51AM +0100, Catalin Marinas wrote:

quoted

On Fri, Aug 29, 2025 at 06:13:11PM -0400, Steven Rostedt wrote:

quoted

On Fri, 29 Aug 2025 20:53:40 +0100
Catalin Marinas [off-list ref] wrote:
valid user address.

quoted

BTW, arm64 also bails out early in do_page_fault() if in_atomic() but I
suspect that's not the case here.

Adding Al Viro since since he wrote a large part of uaccess.h.

So, __copy_from_user_inatomic() is supposed to be called if
pagefault_disable() has already been called? If this is the case, can we
add more comments to this code? I've been using the inatomic() version this
way in preempt disabled locations since 2016.

This should work as long as in_atomic() returns true as it's checked in
the arm64 fault code. With PREEMPT_NONE, however, I don't think this
works.

Sorry, what exactly breaks for the PREEMPT_NONE case?

This code would trigger a warning:

	preempt_disable();
	WARN_ON(!in_atomic());
	preempt_enable();

More importantly, a faulting __copy_from_user_inatomic() between
get/put_cpu() could trigger migration.

quoted

I just wanted to figure out why __copy_from_user_inatomic() wasn't atomic.
If anything, it needs to be better documented.

Yeah, I had no idea until I looked at the code. I guess it means it can
be safely used if in_atomic() == true (well, making it up, not sure what
the intention was).

I think that was the intention -- it's the caller asserting that they
know the access won't fault (and hence won't sleep), and that's why
__copy_to_user_inatomic() and __copy_to_user() only differ by the latter
calling might_sleep().

It looks like other inconsistencies have crept in by accident. AFAICT
the should_fail_usercopy() check in __copy_from_user() was accidentally
missed from __copy_from_user_inatomic() when it was introduced in
commit:

I was wondering about that but some code comment for the inatomic
variant states that it's the responsibility of the caller to ensure it
doesn't fault. Not sure one can do other than pinning the page _and_
taking the mm lock. So I agree we should add the fault injection here as
well.

... so there's a bunch of scope for cleanup, and we could probably have:

	/*
	 * TODO: comment saying to only call this directly when you know
	 * that the fault handler won't sleep.
	 */
	static __always_inline __must_check unsigned long
	__copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
	{
		...
	}

	static __always_inline __must_check unsigned long
	__copy_from_user(void *to, const void __user *from, unsigned long n)
	{
		might_fault();
		return __copy_from_user_inatomic();
	}

... to avoid the inconsistency.

I think the _inatomic variant should be reserved to arch code that knows
the conditions. Generic code/drivers may not necessarily be aware of
what the arch fault handler does. The _nofault API I think is better
suited in generic code.

-- 
Catalin

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help