Re: [PATCH v30 07/12] landlock: Support filesystem access-control

From: Jann Horn <jannh@google.com>
Date: 2021-03-24 03:12:05
Also in: linux-arch, linux-doc, linux-fsdevel, linux-kselftest, linux-security-module, lkml

On Tue, Mar 23, 2021 at 8:22 PM Mickaël Salaün [off-list ref] wrote:

quoted hunk ↗ jump to hunk

On 23/03/2021 18:49, Jann Horn wrote:

quoted

On Tue, Mar 23, 2021 at 4:54 PM Mickaël Salaün [off-list ref] wrote:

quoted

On 23/03/2021 01:13, Jann Horn wrote:

quoted

 On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün [off-list ref] wrote:

quoted

Using Landlock objects and ruleset, it is possible to tag inodes
according to a process's domain.

[...]

quoted

+static void release_inode(struct landlock_object *const object)
+       __releases(object->lock)
+{
+       struct inode *const inode = object->underobj;
+       struct super_block *sb;
+
+       if (!inode) {
+               spin_unlock(&object->lock);
+               return;
+       }
+
+       /*
+        * Protects against concurrent use by hook_sb_delete() of the reference
+        * to the underlying inode.
+        */
+       object->underobj = NULL;
+       /*
+        * Makes sure that if the filesystem is concurrently unmounted,
+        * hook_sb_delete() will wait for us to finish iput().
+        */
+       sb = inode->i_sb;
+       atomic_long_inc(&landlock_superblock(sb)->inode_refs);
+       spin_unlock(&object->lock);
+       /*
+        * Because object->underobj was not NULL, hook_sb_delete() and
+        * get_inode_object() guarantee that it is safe to reset
+        * landlock_inode(inode)->object while it is not NULL.  It is therefore
+        * not necessary to lock inode->i_lock.
+        */
+       rcu_assign_pointer(landlock_inode(inode)->object, NULL);
+       /*
+        * Now, new rules can safely be tied to @inode with get_inode_object().
+        */
+
+       iput(inode);
+       if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs))
+               wake_up_var(&landlock_superblock(sb)->inode_refs);
+}

[...]

quoted

+static struct landlock_object *get_inode_object(struct inode *const inode)
+{
+       struct landlock_object *object, *new_object;
+       struct landlock_inode_security *inode_sec = landlock_inode(inode);
+
+       rcu_read_lock();
+retry:
+       object = rcu_dereference(inode_sec->object);
+       if (object) {
+               if (likely(refcount_inc_not_zero(&object->usage))) {
+                       rcu_read_unlock();
+                       return object;
+               }
+               /*
+                * We are racing with release_inode(), the object is going
+                * away.  Wait for release_inode(), then retry.
+                */
+               spin_lock(&object->lock);
+               spin_unlock(&object->lock);
+               goto retry;
+       }
+       rcu_read_unlock();
+
+       /*
+        * If there is no object tied to @inode, then create a new one (without
+        * holding any locks).
+        */
+       new_object = landlock_create_object(&landlock_fs_underops, inode);
+       if (IS_ERR(new_object))
+               return new_object;
+
+       /* Protects against concurrent get_inode_object() calls. */
+       spin_lock(&inode->i_lock);
+       object = rcu_dereference_protected(inode_sec->object,
+                       lockdep_is_held(&inode->i_lock));

rcu_dereference_protected() requires that inode_sec->object is not
concurrently changed, but I think another thread could call
get_inode_object() while we're in landlock_create_object(), and then
we could race with the NULL write in release_inode() here? (It
wouldn't actually be a UAF though because we're not actually accessing
`object` here.) Or am I missing a lock that prevents this?

In v28 this wasn't an issue because release_inode() was holding
inode->i_lock (and object->lock) during the NULL store; but in v29 and
this version the NULL store in release_inode() moved out of the locked
region. I think you could just move the NULL store in release_inode()
back up (and maybe add a comment explaining the locking rules for
landlock_inode(...)->object)?

(Or alternatively you could use rcu_dereference_raw() with a comment
explaining that the read pointer is only used to check for NULL-ness,
and that it is guaranteed that the pointer can't change if it is NULL
and we're holding the lock. But that'd be needlessly complicated, I
think.)

To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in
release_inode() or in hook_sb_delete(), the
landlock_inode(inode)->object need to be non-NULL,

Yes.

quoted

which implies that a
call to get_inode_object(inode) either "retry" (because release_inode is
only called by landlock_put_object, which set object->usage to 0) until
it creates a new object, or reuses the existing referenced object (and
increments object->usage).

But it can be that landlock_inode(inode)->object only becomes non-NULL
after get_inode_object() has checked
rcu_dereference(inode_sec->object).

quoted

The worse case would be if
get_inode_object(inode) is called just before the
rcu_assign_pointer(landlock_inode(inode)->object, NULL) from
hook_sb_delete(), which would result in an object with a NULL underobj,
which is the expected behavior (and checked by release_inode).

The scenario I'm talking about doesn't involve hook_sb_delete().

quoted

The line rcu_assign_pointer(inode_sec->object, new_object) from
get_inode_object() can only be reached if the underlying inode doesn't
reference an object,

Yes.

quoted

in which case hook_sb_delete() will not reach the
rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this
same inode.

This works because get_inode_object(inode) is mutually exclusive to
itself with the same inode (i.e. an inode can only point to an object
that references this same inode).

To clarify: You can concurrently call get_inode_object() multiple
times on the same inode, right? There are no locks held on entry to
that function.

quoted

I tried to explain this with the comment "Protects against concurrent
get_inode_object() calls" in get_inode_object(), and the comments just
before both rcu_assign_pointer(landlock_inode(inode)->object, NULL).

The scenario I'm talking about is:

Initially the inode does not have an associated landlock_object. There
are two threads A and B. Thread A is going to execute
get_inode_object(). Thread B is going to execute get_inode_object()
followed immediately by landlock_put_object().

thread A: enters get_inode_object()
thread A: rcu_dereference(inode_sec->object) returns NULL
thread A: enters landlock_create_object()
thread B: enters get_inode_object()
thread B: rcu_dereference(inode_sec->object) returns NULL
thread B: calls landlock_create_object()
thread B: sets inode_sec->object while holding inode->i_lock
thread B: leaves get_inode_object()
thread B: enters landlock_put_object()
thread B: object->usage drops to 0, object->lock is taken
thread B: calls release_inode()
thread B: drops object->lock
thread A: returns from landlock_create_object()
thread A: takes inode->i_lock

At this point, thread B will run:

    rcu_assign_pointer(landlock_inode(inode)->object, NULL);

while thread A runs:

    rcu_dereference_protected(inode_sec->object,
        lockdep_is_held(&inode->i_lock));

meaning there is a (theoretical) data race, since
rcu_dereference_protected() doesn't use READ_ONCE().

Hum, I see, that is what I was missing. And that explain why there is
(in practice) no impact on winning the race.

I would prefer to use rcu_access_pointer() instead of
rcu_dereference_protected() to avoid pitfall, and it reflects what I was
expecting:

--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c

@@ -117,9 +117,7 @@ static struct landlock_object

*get_inode_object(struct inode *const inode)

        /* Protects against concurrent get_inode_object() calls. */
        spin_lock(&inode->i_lock);
-       object = rcu_dereference_protected(inode_sec->object,
-                       lockdep_is_held(&inode->i_lock));
-       if (unlikely(object)) {
+       if (unlikely(rcu_access_pointer(inode_sec->object))) {
                /* Someone else just created the object, bail out and
retry. */
                spin_unlock(&inode->i_lock);
                kfree(new_object);

Ah, yeah, that should work. I had forgotten about rcu_access_pointer().

quoted hunk ↗ jump to hunk

But I'm not sure about your proposition to move the NULL store in
release_inode() back up. Do you mean to add back the inode lock in
release_inode() like this?

--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c

@@ -59,16 +59,12 @@ static void release_inode(struct landlock_object

*const object)
         * Makes sure that if the filesystem is concurrently unmounted,
         * hook_sb_delete() will wait for us to finish iput().
         */
+       spin_lock(&inode->i_lock);
        sb = inode->i_sb;
        atomic_long_inc(&landlock_superblock(sb)->inode_refs);
        spin_unlock(&object->lock);
-       /*
-        * Because object->underobj was not NULL, hook_sb_delete() and
-        * get_inode_object() guarantee that it is safe to reset
-        * landlock_inode(inode)->object while it is not NULL.  It is therefore
-        * not necessary to lock inode->i_lock.
-        */
        rcu_assign_pointer(landlock_inode(inode)->object, NULL);
+       spin_unlock(&inode->i_lock);
        /*
         * Now, new rules can safely be tied to @inode with get_inode_object().
         */


I would prefer to avoid nested locks if it is not necessary though.

Hm, yeah, you have a point there.

Doing it locklessly does make the locking rules a little complicated
though, and you'll have to update the comment inside struct
landlock_inode_security. At the moment, it says:

* @object: Weak pointer to an allocated object.  All writes (i.e.
* creating a new object or removing one) are protected by the
* underlying inode->i_lock.  Disassociating @object from the inode is
* additionally protected by @object->lock, from the time @object's
* usage refcount drops to zero to the time this pointer is nulled out.

which isn't true anymore.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help