Thread (4 messages) 4 messages, 2 authors, 2024-11-28

Re: [PATCH] kernel/sys: Optimize do_prlimit lock scope to reduce contention

From: Oleg Nesterov <oleg@redhat.com>
Date: 2024-11-28 07:14:27
Also in: lkml

On 11/27, Andrew Morton wrote:
On Wed, 20 Nov 2024 21:21:56 +0800 Zhen Ni [off-list ref] wrote:
quoted
The security_task_setrlimit function is a Linux Security Module (LSM)
hook that evaluates resource limit changes based on security policies.
It does not alter the rlim data structure, as confirmed by existing
LSM implementations (e.g., SELinux and AppArmor). Thus, this function
does not require locking, ensuring correctness while improving
concurrency.
Seems sane.

Does any code call do_prlimit() frequently enough for this to matter?
I have the same question...
quoted
-	task_lock(tsk->group_leader);
 	if (new_rlim) {
 		/*
 		 * Keep the capable check against init_user_ns until cgroups can
 		 * contain all limits.
 		 */
 		if (new_rlim->rlim_max > rlim->rlim_max &&
-				!capable(CAP_SYS_RESOURCE))
-			retval = -EPERM;
-		if (!retval)
-			retval = security_task_setrlimit(tsk, resource, new_rlim);
+		    !capable(CAP_SYS_RESOURCE))
+			return -EPERM;
+		retval = security_task_setrlimit(tsk, resource, new_rlim);
+		if (retval)
+			return retval;
 	}
+
+	task_lock(tsk->group_leader);
The problem is that task_lock(tsk->group_leader) doesn't look right with or
without this patch. I'll try to make a fix on weekend.

If the caller is sys_prlimit64() and tsk != current, then ->group_leader is
not stable, do_prlimit() can race with mt exec and take the wrong lock.

Oleg.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help