Thread (63 messages), 6 authors, 2021-12-13

Re: [PATCH v5 13/16] ima: Move some IMA policy and filesystem related variables into ima_namespace

From: James Bottomley <hidden>
Date: 2021-12-10 14:22:25
Also in: linux-integrity, lkml

On Fri, 2021-12-10 at 08:57 -0500, Stefan Berger wrote:
On 12/10/21 06:32, Christian Brauner wrote:
On Thu, Dec 09, 2021 at 07:57:02PM -0500, Stefan Berger wrote:
On 12/9/21 14:11, Christian Brauner wrote:
  From 1f03dc427c583d5e9ebc9ebe9de77c3c535bbebe Mon Sep 17 00:00:00 2001
From: Christian Brauner <redacted>
Date: Thu, 9 Dec 2021 20:07:02 +0100
Subject: [PATCH] !!!! HERE BE DRAGONS - UNTESTED !!!!

---
   security/integrity/ima/ima_fs.c | 43 +++++++++++++++++++++++++++++----
   1 file changed, 38 insertions(+), 5 deletions(-)
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 583462b29cb5..d5b302b925b8 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -317,10 +317,14 @@ static ssize_t ima_read_policy(char *path)
   static ssize_t ima_write_policy(struct file *file, const char __user *buf,
   				size_t datalen, loff_t *ppos)
   {
-	struct ima_namespace *ns = get_current_ns();
+	struct ima_namespace *ns;
+	struct user_namespace *user_ns;
   	char *data;
   	ssize_t result;
+	user_ns = ima_filp_private(filp);
+	ns = user_ns->ima_ns
+
   	if (datalen >= PAGE_SIZE)
   		datalen = PAGE_SIZE - 1;
@@ -373,26 +377,51 @@ static const struct seq_operations ima_policy_seqops = {
   };
   #endif
+static struct user_namespace *ima_filp_private(struct file *filp)
+{
+	if (!(filp->f_flags & O_WRONLY)) {
+#ifdef CONFIG_IMA_READ_POLICY
+		struct seq_file *seq;
+
+		seq = filp->private_data;
+		return seq->private;
+#endif
+	}
+	return filp->private_data;
+}
+
   /*
    * ima_open_policy: sequentialize access to the policy file
    */
   static int ima_open_policy(struct inode *inode, struct file *filp)
   {
-	struct ima_namespace *ns = get_current_ns();
+	struct user_namespace *user_ns = current_user_ns();

Do we have to take a reference on the user namespace assuming one can
open the file, pass the fd down the hierarchy, and then the user
namespace with the opened file goes away? Or is there anything else
that keeps the user namespace alive?

No, we don't. When ima_open_policy() is called we do
current_user_ns() but that is guaranteed to be identical to
filp->f_cred->user_ns. And f_cred is a reference that was taken
when the vfs allocated a struct file for this .open call, so it
won't go away until the last fput.
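
To make the lifetime argument above concrete, here is a minimal userspace sketch. It is not kernel code: the structs only borrow the kernel names, the counters are plain ints, and model_open(), model_fput(), and ns_pinned_by_open_file() are hypothetical helpers invented for this illustration. It models the chain "file holds a cred reference, cred holds a user namespace reference" and checks that the namespace survives until the last fput.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel objects (assumption: plain int refcounts). */
struct user_namespace { int refs; bool alive; };
struct cred { int refs; struct user_namespace *user_ns; };
struct file { int refs; struct cred *f_cred; };

static void put_user_ns(struct user_namespace *ns)
{
	if (--ns->refs == 0)
		ns->alive = false;
}

/* Dropping the last cred reference releases the cred's user namespace reference. */
static void put_cred(struct cred *c)
{
	if (--c->refs == 0)
		put_user_ns(c->user_ns);
}

/* Models the VFS allocating a struct file for .open: it takes a cred reference. */
static void model_open(struct file *f, struct cred *opener)
{
	opener->refs++;
	f->f_cred = opener;
	f->refs = 1;
}

/* Models fput(): f_cred is released only with the last file reference. */
static void model_fput(struct file *f)
{
	if (--f->refs == 0) {
		put_cred(f->f_cred);
		f->f_cred = NULL;
	}
}

/* True if the namespace outlives the opening task while the fd is still held. */
static bool ns_pinned_by_open_file(void)
{
	struct user_namespace ns = { .refs = 1, .alive = true }; /* ref held by cred */
	struct cred cred = { .refs = 1, .user_ns = &ns };        /* ref held by task */
	struct file f;
	bool alive_after_task_exit;

	model_open(&f, &cred);
	f.refs++;          /* fd passed to another process */
	put_cred(&cred);   /* opening task exits and drops its cred */
	model_fput(&f);    /* opener's copy of the fd is closed */
	alive_after_task_exit = ns.alive;
	model_fput(&f);    /* last fput */
	return alive_after_task_exit && !ns.alive;
}
```

Calling ns_pinned_by_open_file() from a small main() exercises the whole sequence; in this model the namespace is released only by the final model_fput().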

My proposal is also too complicated, I think.
(The booster is giving me the same side-effects as my second shot
so this looks like two good days of fever and headache. So I'll use
that as an excuse. :))

Your patch series as it stands has a bit of a security issue with
those get_current_ns() calls across different file/seq_file
operations.
You have to make an architectural decision, I think. I see two
sensible options:
1. The relevant ima_ns that .open/.read/.write operate on is always
taken to be the ima_ns of the filesystem's userns, i.e.
sb->s_user_ns->ima_ns.
    This - but I'm not an ima person - makes the most sense to me
and the semantics are straightforward. If I write to a file to
alter some policy then I expect that to affect the ima namespace of
the user namespace that the securityfs instance was mounted in.
2. The relevant ima_ns that .open/.read/.write operate on is always
taken to be the one of the opener. I don't really like that as that
gets weird if for some complicated reason the caller is not located
in the userns the filesystem was mounted in (weird mount
propagation scenario or something). It also feels strange to operate on an
ima_ns that's different from s_user_ns->ima_ns in a securityfs
instance.
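
The two options above can be contrasted with a small userspace sketch. This is an illustration under stated assumptions, not the patch: the structs are cut-down models, the direct file->sb pointer stands in for going through the inode, and ima_ns_option1(), ima_ns_option2(), and options_disagree() are hypothetical names. The scenario is a securityfs instance mounted in a host user namespace and a policy file opened by a task in a child user namespace.

```c
#include <assert.h>

/* Cut-down models of the kernel objects; only the fields needed here. */
struct ima_namespace { const char *name; };
struct user_namespace { struct ima_namespace *ima_ns; };
struct super_block { struct user_namespace *s_user_ns; };
struct cred { struct user_namespace *user_ns; };
struct file {
	struct super_block *sb;     /* assumption: shortcut for inode->i_sb */
	const struct cred *f_cred;  /* credentials captured at open time */
};

/* Option 1: the user namespace the filesystem was mounted in decides. */
static struct ima_namespace *ima_ns_option1(const struct file *f)
{
	return f->sb->s_user_ns->ima_ns;
}

/* Option 2: the opener's user namespace decides. */
static struct ima_namespace *ima_ns_option2(const struct file *f)
{
	return f->f_cred->user_ns->ima_ns;
}

/* Returns 1 when the two options pick different ima namespaces. */
static int options_disagree(void)
{
	struct ima_namespace host_ima = { "host" }, child_ima = { "child" };
	struct user_namespace host_userns = { &host_ima };
	struct user_namespace child_userns = { &child_ima };
	struct super_block securityfs_sb = { &host_userns }; /* mounted by host */
	struct cred child_cred = { &child_userns };          /* opener is in child */
	struct file policy = { &securityfs_sb, &child_cred };

	return ima_ns_option1(&policy) == &host_ima &&
	       ima_ns_option2(&policy) == &child_ima;
}
```

In this scenario option 1 resolves to the host's ima_ns regardless of who holds the fd, while option 2 follows the opener, which is the behaviour described above as getting weird.
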
We have this situation because one can setns() to another mount
namespace but the data shown by SecurityFS lives in a user
namespace, right?

Well, not necessarily.  There is another case where only the userns is
unshared and securityfs is never mounted inside the container.  If the
process has the capability to open the securityfs files (kubernetes
privileged container, say), what should it see? The analogue with the
pid namespace says it should see the contents of what the parent
had mounted because if it wanted to see its own it would have done a
mount of securityfs inside the userns.  This argues for
sb->s_user_ns->ima_ns.

For the setns mount namespace case, the vfsmnt tree is duplicated, so
if the securityfs sb->s_user_ns is your user namespace in the prior
mount namespace, it will end up being so in the new one.  sb->s_user_ns
only changes on actual mount.
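
The point that sb->s_user_ns changes only on an actual mount can be sketched in userspace as follows. This is a model, not kernel code: clone_mount() and mount_securityfs() are hypothetical helpers, and the assumption that a fresh securityfs mount records the mounter's user namespace in its superblock reflects the per-userns mount being discussed in this series.

```c
#include <assert.h>

struct user_namespace { int id; };
struct super_block { struct user_namespace *s_user_ns; };
struct mount { struct super_block *sb; };

/* Models duplicating the vfsmnt tree for a new mount namespace:
 * the copied mount shares the original superblock. */
static struct mount clone_mount(const struct mount *m)
{
	struct mount copy = { m->sb };
	return copy;
}

/* Models an actual mount (assumption): a new superblock records the
 * mounter's user namespace. */
static void mount_securityfs(struct super_block *sb, struct mount *m,
			     struct user_namespace *mounter)
{
	sb->s_user_ns = mounter;
	m->sb = sb;
}

/* Returns 1 if s_user_ns is preserved across the copy and changes only
 * when securityfs is mounted again from the child user namespace. */
static int s_user_ns_follows_mount_only(void)
{
	struct user_namespace parent = { 1 }, child = { 2 };
	struct super_block parent_sb, child_sb;
	struct mount parent_mnt, copied_mnt, child_mnt;

	mount_securityfs(&parent_sb, &parent_mnt, &parent);
	copied_mnt = clone_mount(&parent_mnt);   /* new mount namespace */
	if (copied_mnt.sb->s_user_ns != &parent)
		return 0;

	mount_securityfs(&child_sb, &child_mnt, &child); /* explicit new mount */
	return child_mnt.sb->s_user_ns == &child &&
	       copied_mnt.sb->s_user_ns == &parent;
}
```

Under option 1 a container that never mounts securityfs therefore keeps seeing the parent's data through the copied mount, matching the pid namespace and /proc analogy.
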

And now we need to decide whether to affect the data in the user
namespace that did the open (option 2) or the one to which the
SecurityFS belongs (option 1). If we were to open a regular file
it would be option 1, so we should probably not break that existing
semantic and also choose option 1, unless one wasn't allowed to
choose the user namespace the SecurityFS files belonged to, in which
case it should be option 2

Once the userns is unshared, IMA accounting is done inside the
namespace.  However, in order to see the results, the container must
mount securityfs in the userns.  I can't think of a good reason why a
privileged container should want to be accounted separately but see the
results of its parents, but similarly I can't see why a pid namespace
should want to see /proc of its parent either ... yet that's the
semantic we have today.

but then we have file descriptor passing where 'being allowed' can
change depending on who is reading/writing a file... Is there
anything that would prevent us from setns()'ing to that target user
namespace so that we would now see that of a user namespace that we
are not allowed to see?

If you're able to setns to a user namespace, you logically have all its
privileges, so that problem shouldn't arise.

Option 2 is basically sliding back towards securityfs magically
changing properties depending on which userns is asking.  If we're
going to support that, I don't see what was wrong with the owner/guid
magically changing as well like I first proposed.  If we're going to
insist on a new mount of securityfs, I think it has to function cleanly
like the pid namespace, so option 1 is required.

James
