Thread (49 messages) 49 messages, 4 authors, 2021-03-25

Re: [PATCH v30 07/12] landlock: Support filesystem access-control

From: Jann Horn <jannh@google.com>
Date: 2021-03-24 03:12:05
Also in: linux-arch, linux-doc, linux-fsdevel, linux-kselftest, linux-security-module, lkml

On Tue, Mar 23, 2021 at 8:22 PM Mickaël Salaün [off-list ref] wrote:
quoted hunk ↗ jump to hunk
On 23/03/2021 18:49, Jann Horn wrote:
quoted
On Tue, Mar 23, 2021 at 4:54 PM Mickaël Salaün [off-list ref] wrote:
quoted
On 23/03/2021 01:13, Jann Horn wrote:
quoted
 On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün [off-list ref] wrote:
quoted
Using Landlock objects and ruleset, it is possible to tag inodes
according to a process's domain.
[...]
quoted
+static void release_inode(struct landlock_object *const object)
+       __releases(object->lock)
+{
+       struct inode *const inode = object->underobj;
+       struct super_block *sb;
+
+       if (!inode) {
+               spin_unlock(&object->lock);
+               return;
+       }
+
+       /*
+        * Protects against concurrent use by hook_sb_delete() of the reference
+        * to the underlying inode.
+        */
+       object->underobj = NULL;
+       /*
+        * Makes sure that if the filesystem is concurrently unmounted,
+        * hook_sb_delete() will wait for us to finish iput().
+        */
+       sb = inode->i_sb;
+       atomic_long_inc(&landlock_superblock(sb)->inode_refs);
+       spin_unlock(&object->lock);
+       /*
+        * Because object->underobj was not NULL, hook_sb_delete() and
+        * get_inode_object() guarantee that it is safe to reset
+        * landlock_inode(inode)->object while it is not NULL.  It is therefore
+        * not necessary to lock inode->i_lock.
+        */
+       rcu_assign_pointer(landlock_inode(inode)->object, NULL);
+       /*
+        * Now, new rules can safely be tied to @inode with get_inode_object().
+        */
+
+       iput(inode);
+       if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs))
+               wake_up_var(&landlock_superblock(sb)->inode_refs);
+}
[...]
quoted
+static struct landlock_object *get_inode_object(struct inode *const inode)
+{
+       struct landlock_object *object, *new_object;
+       struct landlock_inode_security *inode_sec = landlock_inode(inode);
+
+       rcu_read_lock();
+retry:
+       object = rcu_dereference(inode_sec->object);
+       if (object) {
+               if (likely(refcount_inc_not_zero(&object->usage))) {
+                       rcu_read_unlock();
+                       return object;
+               }
+               /*
+                * We are racing with release_inode(), the object is going
+                * away.  Wait for release_inode(), then retry.
+                */
+               spin_lock(&object->lock);
+               spin_unlock(&object->lock);
+               goto retry;
+       }
+       rcu_read_unlock();
+
+       /*
+        * If there is no object tied to @inode, then create a new one (without
+        * holding any locks).
+        */
+       new_object = landlock_create_object(&landlock_fs_underops, inode);
+       if (IS_ERR(new_object))
+               return new_object;
+
+       /* Protects against concurrent get_inode_object() calls. */
+       spin_lock(&inode->i_lock);
+       object = rcu_dereference_protected(inode_sec->object,
+                       lockdep_is_held(&inode->i_lock));
rcu_dereference_protected() requires that inode_sec->object is not
concurrently changed, but I think another thread could call
get_inode_object() while we're in landlock_create_object(), and then
we could race with the NULL write in release_inode() here? (It
wouldn't actually be a UAF though because we're not actually accessing
`object` here.) Or am I missing a lock that prevents this?

In v28 this wasn't an issue because release_inode() was holding
inode->i_lock (and object->lock) during the NULL store; but in v29 and
this version the NULL store in release_inode() moved out of the locked
region. I think you could just move the NULL store in release_inode()
back up (and maybe add a comment explaining the locking rules for
landlock_inode(...)->object)?

(Or alternatively you could use rcu_dereference_raw() with a comment
explaining that the read pointer is only used to check for NULL-ness,
and that it is guaranteed that the pointer can't change if it is NULL
and we're holding the lock. But that'd be needlessly complicated, I
think.)
To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in
release_inode() or in hook_sb_delete(), the
landlock_inode(inode)->object need to be non-NULL,
Yes.
quoted
which implies that a
call to get_inode_object(inode) either "retry" (because release_inode is
only called by landlock_put_object, which set object->usage to 0) until
it creates a new object, or reuses the existing referenced object (and
increments object->usage).
But it can be that landlock_inode(inode)->object only becomes non-NULL
after get_inode_object() has checked
rcu_dereference(inode_sec->object).
quoted
The worse case would be if
get_inode_object(inode) is called just before the
rcu_assign_pointer(landlock_inode(inode)->object, NULL) from
hook_sb_delete(), which would result in an object with a NULL underobj,
which is the expected behavior (and checked by release_inode).
The scenario I'm talking about doesn't involve hook_sb_delete().
quoted
The line rcu_assign_pointer(inode_sec->object, new_object) from
get_inode_object() can only be reached if the underlying inode doesn't
reference an object,
Yes.
quoted
in which case hook_sb_delete() will not reach the
rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this
same inode.

This works because get_inode_object(inode) is mutually exclusive to
itself with the same inode (i.e. an inode can only point to an object
that references this same inode).
To clarify: You can concurrently call get_inode_object() multiple
times on the same inode, right? There are no locks held on entry to
that function.
quoted
I tried to explain this with the comment "Protects against concurrent
get_inode_object() calls" in get_inode_object(), and the comments just
before both rcu_assign_pointer(landlock_inode(inode)->object, NULL).
The scenario I'm talking about is:

Initially the inode does not have an associated landlock_object. There
are two threads A and B. Thread A is going to execute
get_inode_object(). Thread B is going to execute get_inode_object()
followed immediately by landlock_put_object().

thread A: enters get_inode_object()
thread A: rcu_dereference(inode_sec->object) returns NULL
thread A: enters landlock_create_object()
thread B: enters get_inode_object()
thread B: rcu_dereference(inode_sec->object) returns NULL
thread B: calls landlock_create_object()
thread B: sets inode_sec->object while holding inode->i_lock
thread B: leaves get_inode_object()
thread B: enters landlock_put_object()
thread B: object->usage drops to 0, object->lock is taken
thread B: calls release_inode()
thread B: drops object->lock
thread A: returns from landlock_create_object()
thread A: takes inode->i_lock

At this point, thread B will run:

    rcu_assign_pointer(landlock_inode(inode)->object, NULL);

while thread A runs:

    rcu_dereference_protected(inode_sec->object,
        lockdep_is_held(&inode->i_lock));

meaning there is a (theoretical) data race, since
rcu_dereference_protected() doesn't use READ_ONCE().
Hum, I see, that is what I was missing. And that explain why there is
(in practice) no impact on winning the race.

I would prefer to use rcu_access_pointer() instead of
rcu_dereference_protected() to avoid pitfall, and it reflects what I was
expecting:
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -117,9 +117,7 @@ static struct landlock_object
*get_inode_object(struct inode *const inode)

        /* Protects against concurrent get_inode_object() calls. */
        spin_lock(&inode->i_lock);
-       object = rcu_dereference_protected(inode_sec->object,
-                       lockdep_is_held(&inode->i_lock));
-       if (unlikely(object)) {
+       if (unlikely(rcu_access_pointer(inode_sec->object))) {
                /* Someone else just created the object, bail out and
retry. */
                spin_unlock(&inode->i_lock);
                kfree(new_object);
Ah, yeah, that should work. I had forgotten about rcu_access_pointer().
quoted hunk ↗ jump to hunk
But I'm not sure about your proposition to move the NULL store in
release_inode() back up. Do you mean to add back the inode lock in
release_inode() like this?
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -59,16 +59,12 @@ static void release_inode(struct landlock_object
*const object)
         * Makes sure that if the filesystem is concurrently unmounted,
         * hook_sb_delete() will wait for us to finish iput().
         */
+       spin_lock(&inode->i_lock);
        sb = inode->i_sb;
        atomic_long_inc(&landlock_superblock(sb)->inode_refs);
        spin_unlock(&object->lock);
-       /*
-        * Because object->underobj was not NULL, hook_sb_delete() and
-        * get_inode_object() guarantee that it is safe to reset
-        * landlock_inode(inode)->object while it is not NULL.  It is therefore
-        * not necessary to lock inode->i_lock.
-        */
        rcu_assign_pointer(landlock_inode(inode)->object, NULL);
+       spin_unlock(&inode->i_lock);
        /*
         * Now, new rules can safely be tied to @inode with get_inode_object().
         */


I would prefer to avoid nested locks if it is not necessary though.
Hm, yeah, you have a point there.

Doing it locklessly does make the locking rules a little complicated
though, and you'll have to update the comment inside struct
landlock_inode_security. At the moment, it says:

* @object: Weak pointer to an allocated object.  All writes (i.e.
* creating a new object or removing one) are protected by the
* underlying inode->i_lock.  Disassociating @object from the inode is
* additionally protected by @object->lock, from the time @object's
* usage refcount drops to zero to the time this pointer is nulled out.

which isn't true anymore.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help