Re: [RFC 0/4] Landlock: ioctl support
From: Mickaël Salaün <mic@digikod.net>
Date: 2023-06-19 18:57:50
Also in:
linux-fsdevel
On 19/06/2023 18:21, Günther Noack wrote:
Hello Mickaël! On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote:quoted
This subject is not easy, but I think we're reaching a consensus (see my 3-steps proposal plan below). I answered your questions about the (complex) interface I proposed, but we should focus on the first step now (your initial proposal) and get back to the other steps later in another email thread.Thanks for the review!quoted
On 10/05/2023 21:21, Günther Noack wrote:quoted
[...] Some specific things I don't understand well are: [...]Thanks, this all make sense now. 👍quoted
quoted
Would it not be a more orthogonal API if the "file selection" part of the Landlock API and the "policy adding" part for these selected files were independent of each other? Then the .device and .file_type selection scheme could be used for the existing policies as well?Both approaches have pros and cons. I propose a new incremental approach below that starts with the simple case where there is no direct links between different rule types (only the third step add that).quoted
* When restricting by dev_t (major and minor number), aren't there use cases where a system might have 256 CDROM drives, and you'd need to allow-list all of these minor number combinations?Indeed, we should be able to just ignore device minors.They device numbers are listed in https://www.kernel.org/doc/Documentation/admin-guide/devices.txt Some major numbers are grab-bags of miscellaneous devices, for example major numbers 10 and 13; 13/0-31 (maj/min) are joysticks, whereas 13/32-62) are mice. Maybe this can be specified as ranges on dev_t, so that it would be possible to specify 13/32-62, without matching the joysticks too?
Indeed, being able to specifying a range would be useful. I'm not sure about the whole dev_t range or only a minor range though. From a semantic POV, there is no links between majors.
quoted
quoted
I think that it might be a feasible approach to start with the LANDLOCK_ACCESS_FS_IOCTL approach and then look at its usage to understand whether we see a significant number of programs whose sandboxes are too coarse because of this. If more fine-granular control is needed, we can still put the other approach on top, and the additional complexity from LANDLOCK_ACCESS_FS_IOCTL that we have to support is not that dramatically high. [...]I agree that IOCTLs are a security risk and that we should propose a simple solution short-term, and maybe a more complete one long-term. The main issue with a unique IOCTL access right for a file hierarchy is that we may not know which device/driver will be the target/server, and hence if we need to allow some IOCTL for regular files (e.g., fscrypt), we might end up allowing all IOCTLs. Here is a plan to incrementally develop a fine-grained IOCTL access control in 3 steps: 1/ Add a simple IOCTL access right for path_beneath: what you initially proposed. For systems that already configure nodev mount points, it could be even more useful (e.g., safely allow IOCTL on /home for fscrypt, and specific /dev files otherwise).Ack, I'll continue on that implementation then. 👍quoted
2/ Create a new type of rule to identify file/device type: struct landlock_inode_type_attr { __u64 allowed_access; /* same as for path_beneath */ __u64 flags; /* bit 0: ignores device minor */ dev_t device; /* same as stat's st_rdev */ __u16 file_type; /* same as stat's st_mode & S_IFMT */ }; We'll probably need to differentiate the handled accesses for path_beneath rules and those for inode_type rules to be more useful. One issue with this type of rule is that it could be used as an oracle to bypass stat restrictions. We could check if such (virtual) action is allowed without the current domain though. 3/ Add a new type of rule to match IOCTL commands, with a mechanism to tie this to inode_type rules (because a command ID is relative to a file type/device), and potentially the same mechanism to tie inode_type rules to path_beneath rules. Each of this step can be implemented one after the other, and each one is valuable. What do you think?I think it is a good idea to do it in multiple steps, as I also believe that step 1) already provides value on its own. To make sure we are on the same page, let me paraphrase my understanding here: 1) is what I already sent as RFC, with tests and documentation etc. With 1), callers could allow and deny ioctls on newly opened files, independent of file path.
Yes
2) makes dev_t and file_type predicates which can be used to limit
file accesses on already opened files.
With 2), callers could allow and deny ioctls and other operations
also on files which are already opened before enablement, such
as the TTY FD inherited from the parent process.This is not on already opened files, it would be a complementary rule type to path_beneath ones: if a rule with dev_t/file_type matches and allowed action FOO, and a path_beneath rules matches and also allowed action FOO, then FOO is allowed. However, this could be extended in the future (with a new dedicated flag) to also support already-opened files.
3) would make it possible to restrict individual ioctl commands,
depending on the dev_t, the file_type, and possibly the path.Yes. I think we could even only extend the "landlock_inode_attr" struct to add an ioctl_command field, and potentially others too. I think it would make sense because IOCTLs are tied to a specific device/file type (or even the underlying filesystem). In fact, we can split the third step in two: 3/ Extend landlock_inode_attr with ioctl_command. 4/ Define an inode_category field that contain an arbitrary number which can be match against a path_beneath rule (with a bitmask to be able to match others too). Other steps could allow to match landlock_inode_attr with FDs, add a landlock_fd_range_attr…
Is that accurate? In 2) and 3), I'm still contemplating a bit whether we have alternative approaches, but I believe in any case, it's clear that they can be built on top of each other without much overhead, and for many programs, outright denying ioctl on newly opened files with 1) might be the most straightforward approach which already delivers value.
Right
Some notes which I collected on the side: * I'm still worried about the already-opened file descriptors like the TTY, through which it can be easy to escape a sandbox (c.f. https://nvd.nist.gov/vuln/detail/CVE-2017-5226) - I would like to have a solution for that. Step (2) would fix it, but maybe I can find a workaround to at least detect the problem in step (1) already.
Yes, good example, Landlock should be able to help with this issue thanks to the third step.
* I looked at OpenBSD's pledge and unveil. In OpenBSD, the ioctl policy is based on the properties of an already opened file. They have hardcoded their ioctl logic in the kernel, which would be a mistake to do in Linux due to stronger backwards compatibility guarantees. OpenBSD is making the allow/deny decisions based on device major/minor numbers and file type (usually block/char device). I think it is noteworthy that OpenBSD does *not* use the file path to make that decision.
Yes, we could have the same behavior by only using the rule type added with the second step, but we could easily improve that by combining a path_beneath rule thanks to the third step.
I wonder whether we should replicate that in steps (2) and (3). It would make for a simpler, more orthogonal API, where the ioctl policy would become a property of the task, and it would be enforced independently of the existing path-based logic. When callers combine that with the overall ioctl check from (1), it might be flexible enough for practical purposes.
Yes, this will be possible with the four steps.
* We should not rely on the way that ioctl request numbers are structured, because it is used inconsistently. More background: https://lwn.net/Articles/428140/ (Thank you for pointing out this article, Jeff!)
Definitely, but the user space libraries might help with that. ;)
quoted
quoted
quoted
quoted
* Should we introduce a "landlock_fd_rights_limit()" syscall?[...] We should also think about batch operations on FD (see the close_range syscall), for instance to deny all IOCTLs on inherited or received FDs.Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the rights for an entire range of FDs? I have to admit, I'm not familiar with the real-life use cases of close_range(). In most programs I work with, it's difficult to reason about their ordering once the program has really started to run. So I imagine that close_range() is mostly used to "sanitize" the open file descriptors at the start of main(), and you have a similar use case in mind for this one as well? If it's just about closing the range from 0 to 2, I'm not sure it's worth adding a custom syscall. :)The advantage of this kind of range is to efficiently manage all potential FDs, and the main use case is to close (or change, see the flags) everything *except" 0-2 (i.e. 3-~0), and then avoid a lot of (potentially useless) syscalls. The Landlock interface doesn't need to be a syscall. We could just add a new rule type which could take a FD range and restrict them when calling landlock_restrict_self(). Something like this: struct landlock_fd_attr { __u64 allowed_access; __u32 first; __u32 last; }Ack, I see it a bit better. Let's discuss this topic separately. —Günther