Thread (8 messages) 8 messages, 4 authors, 2022-07-13

Re: Linux 5.18-rc4

From: John Johansen <john.johansen@canonical.com>
Date: 2022-06-06 19:20:01
Also in: linux-fsdevel, linux-mm, lkml

Possibly related (same subject, not in this thread)

On 6/6/22 11:28, Linus Torvalds wrote:
On Mon, Jun 6, 2022 at 8:19 AM Eric W. Biederman [off-list ref] wrote:
quoted
Has anyone looked into this lock ordering issues?
The deadlock is
quoted
quoted
quoted
[78140.503821]        CPU0                    CPU1
[78140.503823]        ----                    ----
[78140.503824]   lock(&newf->file_lock);
[78140.503826]                                lock(&p->alloc_lock);
[78140.503828]                                lock(&newf->file_lock);
[78140.503830]   lock(&ctx->lock);
and the alloc_lock -> file_lock on CPU1 is trivial - it's seq_show()
in fs/proc/fd.c:

        task_lock(task);
        files = task->files;
        if (files) {
                unsigned int fd = proc_fd(m->private);

                spin_lock(&files->file_lock);

and that looks all normal.

But the other chains look painful.

I do see the IPC code doing ugly things, in particular I detest this code:

        task_lock(current);
        list_add(&shp->shm_clist, &current->sysvshm.shm_clist);
        task_unlock(current);

where it is using the task lock to protect the shm_clist list. Nasty.

And it's doing that inside the shm_ids.rwsem lock _and_ inside the
shp->shm_perm.lock.

So the IPC code has newseg() doing

   shmget ->
    ipcget():
     down_write(ids->rwsem) ->
       newseg():
         ipc_addid gets perm->lock
         task_lock(current)

so you have

  ids->rwsem -> perm->lock -> alloc_lock

there.

So now we have that

   ids->rwsem -> ipcperm->lock -> alloc_lock -> file_lock

when you put those sequences together.

But I didn't figure out what the security subsystem angle is and how
that then apparently mixes things up with execve.

Yes, newseg() is doing that

        error = security_shm_alloc(&shp->shm_perm);

while holding rwsem, but I can't see how that matters. From the
lockdep output, rwsem doesn't actually seem to be part of the whole
sequence.

It *looks* like we have

   apparmour ctx->lock -->
      radix_tree_preloads.lock -->
         ipcperm->lock

and apparently that's called under the file_lock somewhere, completing
the circle.

I guess the execve component is that

  begin_new_exec ->
    security_bprm_committing_creds ->
      apparmor_bprm_committing_creds ->
        aa_inherit_files ->
          iterate_fd ->   *takes file_lock*
            match_file ->
              aa_file_perm ->
                update_file_ctx *takes ctx->lock*

so that's how you get file_lock -> ctx->lock.
yes
So you have:

 SHMGET:
    ipcperm->lock -> alloc_lock
 /proc:
    alloc_lock -> file_lock
 apparmor_bprm_committing_creds:
    file_lock -> ctx->lock

and then all you need is ctx->lock -> ipcperm->lock but I didn't find that part.
yeah that is the part I got stuck on, before being pulled away from this
I suspect that part is that both Apparmor and IPC use the idr local lock.
bingo,

apparmor moved its secids allocation from a custom radix tree to idr in

  99cc45e48678 apparmor: Use an IDR to allocate apparmor secids

and ipc is using the idr for its id allocation as well

I can easily lift the secid() allocation out of the ctx->lock but that
would still leave it happening under the file_lock and not fix the problem.
I think the quick solution would be for apparmor to stop using idr, reverting
back at least temporarily to the custom radix tree.

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help