Thread (7 messages) 7 messages, 2 authors, 2019-08-02

Re: [PATCH 1/1] psi: do not require setsched permission from the trigger creator

From: Suren Baghdasaryan <surenb@google.com>
Date: 2019-08-01 18:28:48
Also in: linux-mm, lkml

Hi Peter,
Thanks for sharing your thoughts. I understand your point and I tend
to agree with it. I originally designed this using watchdog as the
example of a critical system health signal and in the context of
mobile device memory pressure is critical but I agree that there are
more important things in life. I checked and your proposal to change
it to FIFO-1 should still work for our purposes. Will test to make
sure and reply to your patch. Couple clarifications in-line.

On Thu, Aug 1, 2019 at 2:51 AM Peter Zijlstra [off-list ref] wrote:
On Tue, Jul 30, 2019 at 10:44:51AM -0700, Suren Baghdasaryan wrote:
quoted
On Tue, Jul 30, 2019 at 1:11 AM Peter Zijlstra [off-list ref] wrote:
quoted
On Mon, Jul 29, 2019 at 06:33:10PM -0700, Suren Baghdasaryan wrote:
quoted
When a process creates a new trigger by writing into /proc/pressure/*
files, permissions to write such a file should be used to determine whether
the process is allowed to do so or not. Current implementation would also
require such a process to have setsched capability. Setting of psi trigger
thread's scheduling policy is an implementation detail and should not be
exposed to the user level. Remove the permission check by using _nocheck
version of the function.

Suggested-by: Nick Kralevich <redacted>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 kernel/sched/psi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632c3b82..ed9a1d573cb1 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1061,7 +1061,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
                      mutex_unlock(&group->trigger_lock);
                      return ERR_CAST(kworker);
              }
-             sched_setscheduler(kworker->task, SCHED_FIFO, &param);
+             sched_setscheduler_nocheck(kworker->task, SCHED_FIFO, &param);
ARGGH, wtf is there a FIFO-99!! thread here at all !?
We need psi poll_kworker to be an rt-priority thread so that psi
There is a giant difference between 'needs to be higher than OTHER' and
FIFO-99.
quoted
notifications are delivered to the userspace without delay even when
the CPUs are very congested. Otherwise it's easy to delay psi
notifications by running a simple CPU hogger executing "chrt -f 50 dd
if=/dev/zero of=/dev/null". Because these notifications are
So what; at that point that's exactly what you're asking for. Using RT
is for those who know what they're doing.
quoted
time-critical for reacting to memory shortages we can't allow for such
delays.
Furthermore, actual RT programs will have pre-allocated and locked any
memory they rely on. They don't give a crap about your pressure
nonsense.
This signal is used not to protect other RT tasks but to monitor
overall system memory health for the sake of system responsiveness.
quoted
Notice that this kworker is created only if userspace creates a psi
trigger. So unless you are using psi triggers you will never see this
kthread created.
By marking it FIFO-99 you're in effect saying that your stupid
statistics gathering is more important than your life. It will preempt
the task that's in control of the band-saw emergency break, it will
preempt the task that's adjusting the electromagnetic field containing
this plasma flow.

That's insane.
IMHO an opt-in feature stops being "stupid" as soon as the user opted
in to use it, therefore explicitly indicating interest in it. However
I assume you are using "stupid" here to indicate that it's "less
important" rather than it's "useless".
I'm going to queue a patch to reduce this to FIFO-1, that will preempt
regular OTHER tasks but will not perturb (much) actual RT bits.
Thanks for posting the fix.
--
To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help