Thread (4 messages) 4 messages, 3 authors, 2020-08-24

Re: IOPRIO_CLASS_RT without CAP_SYS_ADMIN?

From: Khazhismel Kumykov <hidden>
Date: 2020-08-24 20:46:22
Also in: lkml

On Sat, Aug 22, 2020 at 7:14 PM Jens Axboe [off-list ref] wrote:
On 8/22/20 7:58 PM, Bart Van Assche wrote:
quoted
On 2020-08-20 17:35, Khazhismel Kumykov wrote:
quoted
It'd be nice to allow a process to send RT requests without granting
it the wide capabilities of CAP_SYS_ADMIN, and we already have a
capability which seems to almost fit this priority idea -
CAP_SYS_NICE? Would this fit there?

Being capable of setting IO priorities on per request or per thread
basis (be it async submission or w/ thread ioprio_set) is useful
especially when the userspace has its own prioritization/scheduling
before hitting the kernel, allowing us to signal to the kernel how to
order certain IOs, and it'd be nice to separate this from ADMIN for
non-root processes, in a way that's less error prone than e.g. having
a trusted launcher ionice the process and then drop priorities for
everything but prio requests.
Hi Khazhy,

In include/uapi/linux/capability.h I found the following:

/* Allow raising priority and setting priority on other (different
   UID) processes */
/* Allow use of FIFO and round-robin (realtime) scheduling on own
   processes and setting the scheduling algorithm used by another
   process. */
/* Allow setting cpu affinity on other processes */
#define CAP_SYS_NICE         23

If it is acceptable that every process that has permission to submit
IOPRIO_CLASS_RT I/O also has permission to modify the priority of
other processes then extending CAP_SYS_NICE is an option. Another
possibility is to extend the block cgroup controller such that the
capability to submit IOPRIO_CLASS_RT I/O can be enabled through the
cgroup interface. There may be other approaches. I'm not sure what
the best approach is.
I think it fits well with CAP_SYS_NICE, especially since that
capability already grants the ability to demote other processes to
IOPRIO_CLASS_IDLE, etc.
I think CAP_SYS_NICE fits pretty nicely, and I was actually planning on
using that for the io_uring SQPOLL side as well. So there is/will be
some precedent for tying it into IO related things, too. For this use
case, I think it's perfect.

--
Jens Axboe

Attachments

  • smime.p7s [application/pkcs7-signature] 3850 bytes
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help