Re: [PATCH v16 20/20] unwind_user/sframe: Add prctl() interface for registering .sframe sections

From: sashiko-bot@kernel.org
Date: 2026-05-21 15:23:35

Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:

New issues:
- [High] The patch adds a test-only UAPI interface that is explicitly not intended for mainline inclusion, violating ABI stability rules.
- [High] Unbounded memory allocation in PR_ADD_SFRAME allows a local Denial of Service (DoS) via memory exhaustion.
- [Low] When SFrame support is disabled, the prctl options incorrectly return -ENOSYS instead of -EINVAL.

Pre-existing issues:
- [Critical] Improper handling of -EFAULT in sframe_find() leads to premature deletion of valid unwinding metadata and poses a severe NMI deadlock risk.
- [High] A malicious user application can reliably trigger a WARN_ON_ONCE in sframe_find() by causing concurrent removals of an .sframe section, potentially panicking the system.
- [High] Unbounded iteration in sframe_validate_section() without yielding the CPU can cause softlockup DoS.
--

commit f2ba64c060979156a7ef9c545cd667594369670a
Author: Josh Poimboeuf [off-list ref]

unwind_user/sframe: Add prctl() interface for registering .sframe sections

The kernel doesn't have direct visibility to the ELF contents of shared
libraries. This commit adds prctl() interfaces to allow glibc to tell the
kernel where to find .sframe sections for unwinding.
[
  This adds an interface for prctl() for testing loading of sframes for
  libraries. But this interface should really be a system call. This patch
  is for testing purposes only and should not be applied to mainline.
]
[Severity: High]
Since this is explicitly marked as a temporary test patch, should it be
redesigned as a proper system call or removed before being merged?

Adding a temporary or experimental prctl() interface exposes an ABI to
userspace that must then be maintained indefinitely, polluting the UAPI.
diff --git a/kernel/sys.c b/kernel/sys.c
index 62e842055cc9c..b0a9b1e3ccd7c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
[ ... ]
@@ -2907,6 +2908,14 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		if (arg3 & PR_CFI_LOCK && !(arg3 & PR_CFI_DISABLE))
 			error = arch_prctl_lock_branch_landing_pad_state(me);
 		break;
+	case PR_ADD_SFRAME:
+		error = sframe_add_section(arg2, arg3, arg4, arg5);
[Severity: High]
Could this allow a local Denial of Service via memory exhaustion?

A user process can repeatedly call prctl(PR_ADD_SFRAME) with valid, small
ranges. If each call allocates a new struct sframe_section and inserts it
into the mm->sframe_mt maple tree using GFP_KERNEL_ACCOUNT, an unprivileged
user might exhaust kernel memory since there is no upper limit on the total
number of sections a process can register.
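
One possible mitigation is a per-process cap on registered sections. The
following is a minimal userspace model of that idea, not code from the patch:
SFRAME_MAX_SECTIONS, struct mm_sframe_state, and both helper names are
hypothetical, and the limit value is arbitrary.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical limit; the patch defines no such constant. */
#define SFRAME_MAX_SECTIONS 4096

/* Stand-in for per-mm bookkeeping; the field name is an assumption. */
struct mm_sframe_state {
	size_t nr_sections;
};

/* Reserve a slot before allocating; fail once the cap is reached. */
static int sframe_reserve_slot(struct mm_sframe_state *st)
{
	if (st->nr_sections >= SFRAME_MAX_SECTIONS)
		return -ENOSPC;
	st->nr_sections++;
	return 0;
}

/* Give the slot back when a section is removed or insertion fails. */
static void sframe_release_slot(struct mm_sframe_state *st)
{
	if (st->nr_sections)
		st->nr_sections--;
}
```

In a real implementation the counter would have to be updated under the same
lock that protects the tree, which this model leaves out.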

[Severity: Low]
Is it intentional that the fallback stub for sframe_add_section() returns
-ENOSYS when SFrame support is disabled?

include/linux/sframe.h:sframe_add_section() {
    ...
    return -ENOSYS;
}

Standard kernel conventions dictate that unsupported or disabled prctl options
should return -EINVAL. Returning -ENOSYS incorrectly indicates to userspace
that the entire prctl syscall is unimplemented.
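
For illustration, a stub following that convention could look like the sketch
below. The function name and parameter names are assumptions made for this
example; only the four unsigned long arguments are taken from the call site
in the hunk.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the disabled-config stub. It reports the
 * option as invalid rather than the syscall as missing.
 */
static inline int sframe_add_section_stub(unsigned long sframe_start,
					  unsigned long sframe_end,
					  unsigned long text_start,
					  unsigned long text_end)
{
	(void)sframe_start;
	(void)sframe_end;
	(void)text_start;
	(void)text_end;
	return -EINVAL;
}
```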

[Severity: High]
This is a pre-existing issue, but is there a risk of a softlockup Denial of
Service during section validation?

When registering a new .sframe section via PR_ADD_SFRAME, the kernel
synchronously validates all FDE and FRE entries in sframe_validate_section().
Because the number of FDEs is read from user-space memory and can be extremely
large, an attacker could provide a massive number of FDEs/FREs.

Since the validation loop does not contain cond_resched(), the kernel could
spin here for seconds without yielding, triggering the softlockup detector.
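
A userspace model of a bounded, yielding validation loop is sketched below.
Everything in it is an assumption for illustration: struct fde_model, the
SFRAME_MAX_FDES cap, and the yield interval are invented, and sched_yield()
stands in for the kernel's cond_resched().

```c
#include <errno.h>
#include <sched.h>
#include <stddef.h>

#define SFRAME_MAX_FDES (1u << 20)	/* hypothetical upper bound */
#define YIELD_INTERVAL  1024		/* hypothetical; yield every N entries */

/* Simplified stand-in for an FDE; not the real SFrame layout. */
struct fde_model {
	unsigned int start;
	unsigned int size;
};

/*
 * Validate every entry, rejecting oversized tables up front and
 * yielding the CPU periodically. *yields counts the yield points hit.
 */
static int validate_fdes(const struct fde_model *fdes, size_t num_fdes,
			 size_t *yields)
{
	if (num_fdes > SFRAME_MAX_FDES)
		return -E2BIG;

	for (size_t i = 0; i < num_fdes; i++) {
		if (fdes[i].size == 0)
			return -EINVAL;

		if ((i + 1) % YIELD_INTERVAL == 0) {
			sched_yield();	/* kernel code would use cond_resched() */
			(*yields)++;
		}
	}
	return 0;
}
```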
+		break;
+	case PR_REMOVE_SFRAME:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = sframe_remove_section(arg2);
[Severity: Critical]
This is a pre-existing issue, but does handling -EFAULT in sframe_find() risk
a severe NMI deadlock and premature deletion of valid metadata?

If unsafe_get_user() cleanly faults because the memory is temporarily paged
out, it sets ret = -EFAULT. The cleanup block then calls
WARN_ON_ONCE(sframe_remove_section(sec->sframe_start)).

Because sframe_find() is used by stack unwinders (like perf) that execute
in NMI context, triggering sframe_remove_section() inside an NMI attempts
to acquire the maple tree's internal spinlock (mt_lock). If the NMI
interrupted a thread already holding mt_lock, this could result in a hard
deadlock. Furthermore, this permanently destroys perfectly valid unwinding
metadata simply because a page was not resident.
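
One way to express the distinction is to classify lookup errors before
deciding to drop a section. The helper below is a hypothetical sketch, not
part of the patch, and it assumes every -EFAULT is transient, which may not
hold if the section has been unmapped.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether an error from a lookup should
 * cause the section to be dropped.
 */
static bool sframe_error_is_fatal(int ret)
{
	switch (ret) {
	case 0:		/* success */
	case -EFAULT:	/* page not resident: assumed transient, keep section */
		return false;
	default:	/* e.g. -EINVAL from malformed section data */
		return true;
	}
}
```

This addresses only the metadata loss. Avoiding mt_lock in NMI context would
additionally require deferring any removal to task context.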

[Severity: High]
This is a pre-existing issue, but might a malicious application reliably
trigger the WARN_ON_ONCE in sframe_find(), potentially panicking the system?

If an .sframe section returns -EFAULT because it is paged out, the error
path calls WARN_ON_ONCE(sframe_remove_section(sec->sframe_start)).
If sframe_remove_section() fails to find the section, it returns -EINVAL,
causing the warning to fire.

This condition could be reliably triggered if multiple threads concurrently
unwind through the same paged-out section, or if one thread unwinds while
another explicitly removes the section using the newly added
prctl(PR_REMOVE_SFRAME). If panic_on_warn is enabled, this might allow
unprivileged users to reliably crash the system.
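
The race can be made harmless if the error path treats "section already gone"
as success instead of warning. The single-threaded model below illustrates
that; struct section_table and both function names are invented for this
sketch, and only the -EINVAL "not found" return mirrors the behavior
described above.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_SECTIONS 8	/* small fixed table, for illustration only */

struct section_table {
	unsigned long start[MAX_SECTIONS];
	size_t count;
};

/* Models the removal call: -EINVAL when the section is not found. */
static int table_remove(struct section_table *t, unsigned long start)
{
	for (size_t i = 0; i < t->count; i++) {
		if (t->start[i] == start) {
			/* Fill the hole with the last entry. */
			t->start[i] = t->start[--t->count];
			return 0;
		}
	}
	return -EINVAL;
}

/* Error-path removal: a section someone else removed is not an error. */
static int table_remove_if_present(struct section_table *t,
				   unsigned long start)
{
	int ret = table_remove(t, start);

	return ret == -EINVAL ? 0 : ret;
}
```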
+		break;
 	default:
 		trace_task_prctl_unknown(option, arg2, arg3, arg4, arg5);
 		error = -EINVAL;
-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260521142546.3908498-1-jremus@linux.ibm.com?part=20