[PATCH v6 6/9] seccomp: add "seccomp" syscall
From: luto@amacapital.net (Andy Lutomirski)
Date: 2014-06-13 21:42:27
Also in:
linux-api, linux-arch, linux-mips, lkml
On Fri, Jun 13, 2014 at 2:37 PM, Alexei Starovoitov [off-list ref] wrote:
> On Fri, Jun 13, 2014 at 2:25 PM, Andy Lutomirski [off-list ref] wrote:
>> On Fri, Jun 13, 2014 at 2:22 PM, Alexei Starovoitov [off-list ref] wrote:
>>> On Tue, Jun 10, 2014 at 8:25 PM, Kees Cook [off-list ref] wrote:
>>>> This adds the new "seccomp" syscall with both an "operation" and
>>>> "flags" parameter for future expansion. The third argument is a
>>>> pointer value, used with the SECCOMP_SET_MODE_FILTER operation.
>>>> Currently, flags must be 0. This is functionally equivalent to
>>>> prctl(PR_SET_SECCOMP, ...).
>>>>
>>>> Signed-off-by: Kees Cook <redacted>
>>>> Cc: linux-api at vger.kernel.org
>>>> ---
>>>>  arch/x86/syscalls/syscall_32.tbl  |  1 +
>>>>  arch/x86/syscalls/syscall_64.tbl  |  1 +
>>>>  include/linux/syscalls.h          |  2 ++
>>>>  include/uapi/asm-generic/unistd.h |  4 ++-
>>>>  include/uapi/linux/seccomp.h      |  4 +++
>>>>  kernel/seccomp.c                  | 63 ++++++++++++++++++++++++++++++++-----
>>>>  kernel/sys_ni.c                   |  3 ++
>>>>  7 files changed, 69 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
>>>> index d6b867921612..7527eac24122 100644
>>>> --- a/arch/x86/syscalls/syscall_32.tbl
>>>> +++ b/arch/x86/syscalls/syscall_32.tbl
>>>> @@ -360,3 +360,4 @@
>>>>  351 i386 sched_setattr sys_sched_setattr
>>>>  352 i386 sched_getattr sys_sched_getattr
>>>>  353 i386 renameat2 sys_renameat2
>>>> +354 i386 seccomp sys_seccomp
>>>> diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
>>>> index ec255a1646d2..16272a6c12b7 100644
>>>> --- a/arch/x86/syscalls/syscall_64.tbl
>>>> +++ b/arch/x86/syscalls/syscall_64.tbl
>>>> @@ -323,6 +323,7 @@
>>>>  314 common sched_setattr sys_sched_setattr
>>>>  315 common sched_getattr sys_sched_getattr
>>>>  316 common renameat2 sys_renameat2
>>>> +317 common seccomp sys_seccomp
>>>>
>>>>  #
>>>>  # x32-specific system call numbers start at 512 to avoid cache impact
>>>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>>>> index b0881a0ed322..1713977ee26f 100644
>>>> --- a/include/linux/syscalls.h
>>>> +++ b/include/linux/syscalls.h
>>>> @@ -866,4 +866,6 @@ asmlinkage long sys_process_vm_writev(pid_t pid,
>>>>  asmlinkage long sys_kcmp(pid_t pid1, pid_t pid2, int type,
>>>>                           unsigned long idx1, unsigned long idx2);
>>>>  asmlinkage long sys_finit_module(int fd, const char __user *uargs, int flags);
>>>> +asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
>>>> +                            const char __user *uargs);
>>>
>>> It looks odd to add 'flags' argument to syscall that is not even
>>> used. I don't think it will be extensible this way.
>>> 'uargs' is used only in 2nd command as well and it's not
>>> 'char __user *' but rather 'struct sock_fprog __user *'.
>>> I think it makes more sense to define only first argument as 'int op'
>>> and the rest as variable length array.
>>> Something like:
>>> long sys_seccomp(unsigned int op, struct nlattr *attrs, int len);
>>> then different commands can interpret 'attrs' differently.
>>> if op == mode_strict, then attrs == NULL, len == 0
>>> if op == mode_filter, then attrs->nla_type == seccomp_bpf_filter
>>> and nla_data(attrs) is 'struct sock_fprog'
>>
>> Eww. If the operation doesn't imply the type, then I think we've
>> totally screwed up.
>>
>>> If we decide to add new types of filters or new commands, the
>>> syscall prototype won't need to change. New commands can be added
>>> preserving backward compatibility.
>>> The basic TLV concept has been around forever in netlink world. imo
>>> makes sense to use it with new syscalls. Passing 'struct xxx' into
>>> syscalls is the thing of the past. TLV style is more extensible.
>>> Fields of structures can become optional in the future, new fields
>>> added, etc.
>>> 'struct nlattr' brings the same benefits to kernel api as protobuf
>>> did to user land.
>>
>> I see no reason to bring nl_attr into this.
>>
>> Admittedly, I've never dealt with nl_attr, but everything
>> netlink-related I've ever been involved in has involved some sort of
>> API atrocity.
>
> netlink has a lot of legacy and there is genetlink which is not pretty
> either because of extra socket creation, binding, dealing with packet
> loss issues, but the key concept of variable length encoding is sound.
> Right now seccomp has two commands and they already don't fit into
> single syscall neatly. Are you saying there should be two syscalls
> here? What about another seccomp related command? Another syscall?
> imo all seccomp related commands need to be mux/demux-ed under one
> syscall. What is the way to mux/demux potentially very different
> commands under one syscall? I cannot think of anything better than
> TLV style. 'struct nlattr' is what we have today and I think it works
> fine.
> I'm not suggesting to bring the whole netlink into the picture, but
> rather TLV style of encoding different arguments for different
> commands.
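
For reference, the TLV-style multiplexing proposed in the quoted text could look roughly like the userspace sketch below. It is not from the patch or from anyone in this thread. `struct tlv_attr` only imitates the length/type header layout of `struct nlattr`, and `tlv_put`, `mock_seccomp`, the `ATTR_*` tags and the `OP_*` values are hypothetical names invented for illustration. The sketch also shows the case a tagged encoding has to decide explicitly: what to do when the same tag arrives twice.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute header, modeled on the layout of struct nlattr:
 * 16-bit total length, 16-bit type, payload follows, 4-byte aligned. */
struct tlv_attr {
    uint16_t len;   /* header + payload, in bytes */
    uint16_t type;  /* tag telling the receiver how to read the payload */
};

#define TLV_ALIGN(n)  (((n) + 3u) & ~3u)
#define TLV_HDRLEN    ((unsigned)sizeof(struct tlv_attr))

/* Tags and operations invented for this sketch. */
enum { ATTR_BPF_FILTER = 1, ATTR_MAX = 2 };
enum { OP_MODE_STRICT = 0, OP_MODE_FILTER = 1 };

/* Append one attribute at offset off in a zeroed buffer of size cap;
 * returns the offset just past the (aligned) attribute. */
static size_t tlv_put(unsigned char *buf, size_t cap, size_t off,
                      uint16_t type, const void *data, uint16_t dlen)
{
    struct tlv_attr h = { (uint16_t)(TLV_HDRLEN + dlen), type };

    assert(off + TLV_ALIGN(h.len) <= cap);  /* buffer must be large enough */
    memcpy(buf + off, &h, sizeof(h));
    memcpy(buf + off + TLV_HDRLEN, data, dlen);
    return off + TLV_ALIGN(h.len);
}

/* Stand-in for the proposed sys_seccomp(op, attrs, len).
 * Returns 0 on success, -1 on malformed or ambiguous input. */
static int mock_seccomp(unsigned op, const unsigned char *attrs, size_t len)
{
    int seen[ATTR_MAX] = { 0 };
    size_t off = 0;

    if (op == OP_MODE_STRICT)
        return (attrs == NULL && len == 0) ? 0 : -1;
    if (op != OP_MODE_FILTER || attrs == NULL)
        return -1;

    while (off + TLV_HDRLEN <= len) {
        struct tlv_attr h;

        memcpy(&h, attrs + off, sizeof(h));
        if (h.len < TLV_HDRLEN || h.len > len - off)
            return -1;              /* truncated attribute */
        if (h.type == 0 || h.type >= ATTR_MAX)
            return -1;              /* unknown tag */
        if (seen[h.type]++)
            return -1;              /* same tag twice: this sketch rejects it */
        off += TLV_ALIGN(h.len);
    }
    return seen[ATTR_BPF_FILTER] ? 0 : -1;  /* the filter tag is mandatory */
}
```

With these definitions, a strict-mode call passes `NULL` and a length of 0, a filter call passes a buffer holding one `ATTR_BPF_FILTER` attribute, and a buffer holding that tag twice is rejected. Rejecting duplicates is a policy picked for the sketch; a real interface would have to specify it.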
I'm unconvinced. These are simple commands, and I think the interface
should be simple. Syscalls are cheap.

As an example, the interface could be:

int seccomp_add_filter(const struct sock_fprog *filter, unsigned int flags);

The "tsync" operation would be seccomp_add_filter(NULL,
SECCOMP_ADD_FILTER_TSYNC) -- it's equivalent to adding an always-accept
filter and syncing threads.

But, frankly, this kind of stuff should probably be "do operation X".
IIUC nl_attr is more like "do something, with these tags and values",
which results in oddities like whatever should happen if more than one
tag is set.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
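
A minimal userspace sketch of the single-purpose interface described in this message, assuming semantics the message only outlines. The name `SECCOMP_ADD_FILTER_TSYNC` comes from the message; its value, `struct mock_fprog`, `struct mock_task` and `mock_seccomp_add_filter` are hypothetical stand-ins, and nothing here touches the real kernel interface. A NULL filter combined with the tsync flag is treated as "sync only", which is one reading of "adding an always-accept filter and syncing threads".

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed, hypothetical stand-in for the kernel's struct sock_fprog. */
struct mock_fprog {
    unsigned short len;   /* number of BPF instructions */
    const void *filter;   /* instruction array, opaque in this sketch */
};

/* Flag name taken from the message; the value is an assumption. */
#define SECCOMP_ADD_FILTER_TSYNC 0x1u

/* Hypothetical per-task state, just enough to observe the effects. */
struct mock_task {
    int nr_filters;       /* filters attached so far */
    int synced;           /* 1 once sibling threads were synchronized */
};

/* The operation itself fixes the argument type: a filter pointer plus
 * flags. Returns 0 on success, -1 on invalid arguments. */
static int mock_seccomp_add_filter(struct mock_task *t,
                                   const struct mock_fprog *filter,
                                   unsigned int flags)
{
    assert(t != NULL);

    if (flags & ~SECCOMP_ADD_FILTER_TSYNC)
        return -1;        /* unknown flag bits are rejected */
    if (filter == NULL && !(flags & SECCOMP_ADD_FILTER_TSYNC))
        return -1;        /* no filter and no sync: nothing to do */

    if (filter != NULL) {
        if (filter->len == 0 || filter->filter == NULL)
            return -1;    /* empty program */
        t->nr_filters++;
    }
    if (flags & SECCOMP_ADD_FILTER_TSYNC)
        t->synced = 1;
    return 0;
}
```

Because the pointer type is fixed by the call, there is no tag to validate and no duplicate-tag case to define. Extension happens through flag bits or a new syscall.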