Re: apparmor NULL pointer dereference on resume [efivarfs]

From: Ard Biesheuvel <ardb@kernel.org>
Date: 2025-03-11 07:16:47
Also in: linux-efi, linux-fsdevel

(cc Al Viro)

On Mon, 10 Mar 2025 at 22:49, James Bottomley
[off-list ref] wrote:

On Mon, 2025-03-10 at 12:57 -0700, Ryan Lee wrote:

quoted

On Wed, Mar 5, 2025 at 1:47 PM Malte Schröder
[off-list ref] wrote:

quoted

On 05/03/2025 20:22, Ryan Lee wrote:

quoted

On Wed, Mar 5, 2025 at 11:11 AM Malte Schröder
[off-list ref] wrote:

quoted

Hi,

I hope this is the right place to report this. Since 6.14-rc1, resume
from hibernate no longer works. I finally managed to get dmesg output
from when this happens (the console is frozen, but I managed to log in
over the network). If I read the trace correctly, there seems to be
some interaction with apparmor. I retried with apparmor disabled and
the issue didn't trigger.

Also CC'ing the AppArmor-specific mailing list in this reply.

quoted

I am happy to provide more data if required.

Could you try to reproduce this NULL pointer dereference with a clean
kernel with debug info (that I'd be able to access the source code of)
and send a symbolized stacktrace processed with
scripts/decode_stacktrace.sh?

Sure. Result using plain v6.14-rc5:

[  142.014428] BUG: kernel NULL pointer dereference, address: 0000000000000018
[  142.014429] #PF: supervisor read access in kernel mode
[  142.014431] #PF: error_code(0x0000) - not-present page
[  142.014432] PGD 0 P4D 0
[  142.014433] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[  142.014436] CPU: 4 UID: 0 PID: 6833 Comm: systemd-sleep Not tainted 6.14.0-rc5 #1
[  142.014437] Hardware name: To Be Filled By O.E.M. X570 Extreme4/X570 Extreme4, BIOS P5.60 01/18/2024
[  142.014439] RIP: 0010:apparmor_file_open (./include/linux/mount.h:78 (discriminator 2) ./include/linux/fs.h:2781 (discriminator 2) security/apparmor/lsm.c:483 (discriminator 2))
[  142.014442] Code: c5 00 08 00 00 0f 85 4b 01 00 00 4c 89 e9 31 c0 f6 c1 02 0f 85 fd 00 00 00 48 8b 87 88 00 00 00 4c 8d b7 88 00 00 00 48 89 fd <48> 8b 40 18 48 8b 4f 70 0f b7 11 48 89 c7 66 89 54 24 04 48 8b 51
All code
========
   0:    c5 00 08                 (bad)
   3:    00 00                    add    %al,(%rax)
   5:    0f 85 4b 01 00 00        jne    0x156
   b:    4c 89 e9                 mov    %r13,%rcx
   e:    31 c0                    xor    %eax,%eax
  10:    f6 c1 02                 test   $0x2,%cl
  13:    0f 85 fd 00 00 00        jne    0x116
  19:    48 8b 87 88 00 00 00     mov    0x88(%rdi),%rax
  20:    4c 8d b7 88 00 00 00     lea    0x88(%rdi),%r14
  27:    48 89 fd                 mov    %rdi,%rbp
  2a:*    48 8b 40 18              mov    0x18(%rax),%rax        <-- trapping instruction
  2e:    48 8b 4f 70              mov    0x70(%rdi),%rcx
  32:    0f b7 11                 movzwl (%rcx),%edx
  35:    48 89 c7                 mov    %rax,%rdi
  38:    66 89 54 24 04           mov    %dx,0x4(%rsp)
  3d:    48                       rex.W
  3e:    8b                       .byte 0x8b
  3f:    51                       push   %rcx

Code starting with the faulting instruction
===========================================
   0:    48 8b 40 18              mov    0x18(%rax),%rax
   4:    48 8b 4f 70              mov    0x70(%rdi),%rcx
   8:    0f b7 11                 movzwl (%rcx),%edx
   b:    48 89 c7                 mov    %rax,%rdi
   e:    66 89 54 24 04           mov    %dx,0x4(%rsp)
  13:    48                       rex.W
  14:    8b                       .byte 0x8b
  15:    51                       push   %rcx
[  142.014443] RSP: 0018:ffffb9ef7189bc50 EFLAGS: 00010246
[  142.014445] RAX: 0000000000000000 RBX: ffff95eb5e555b00 RCX: 0000000000000300
[  142.014446] RDX: ffff95f838227538 RSI: 00000000002ba677 RDI: ffff95e992be2a00
[  142.014447] RBP: ffff95e992be2a00 R08: ffff95f838227520 R09: 0000000000000002
[  142.014447] R10: ffff95ea72241d00 R11: 0000000000000001 R12: 0000000000000010
[  142.014448] R13: 0000000000000300 R14: ffff95e992be2a88 R15: ffff95e95a6034e0
[  142.014449] FS:  00007f74ab6cf880(0000) GS:ffff95f838200000(0000) knlGS:0000000000000000
[  142.014450] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  142.014451] CR2: 0000000000000018 CR3: 00000002473b6000 CR4: 0000000000f50ef0
[  142.014452] PKRU: 55555554
[  142.014453] Call Trace:
[  142.014454]  <TASK>
[  142.014456] ? __die_body (arch/x86/kernel/dumpstack.c:421)
[  142.014459] ? page_fault_oops (arch/x86/mm/fault.c:710)
[  142.014460] ? __lock_acquire (kernel/locking/lockdep.c:? kernel/locking/lockdep.c:5174)
[  142.014462] ? local_lock_acquire (./include/linux/local_lock_internal.h:29 (discriminator 1))
[  142.014465] ? do_user_addr_fault (arch/x86/mm/fault.c:?)
[  142.014467] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538)
[  142.014468] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[  142.014471] ? apparmor_file_open (./include/linux/mount.h:78 (discriminator 2) ./include/linux/fs.h:2781 (discriminator 2) security/apparmor/lsm.c:483 (discriminator 2))
[  142.014472] security_file_open (security/security.c:?)
[  142.014474] do_dentry_open (fs/open.c:934)
[  142.014476] kernel_file_open (fs/open.c:1201)
[  142.014477] efivarfs_pm_notify (fs/efivarfs/super.c:505)

I traced the NULL dereference down to efivarfs_pm_notify creating a
struct path with a NULL .mnt pointer which is then passed into
kernel_file_open, which then invokes the LSM file_open security hook,
where AppArmor is not expecting a path that has a NULL .mnt pointer.
The code in question was introduced in b5d1e6ee761a (efivarfs: add
variable resync after hibernation).
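
As a rough illustration of why the fault address is 0x18 rather than 0: the
trapping instruction is "mov 0x18(%rax),%rax" with RAX = 0, and CR2 reports
0x18. That is what a field load through a NULL struct pointer looks like when
the field sits 0x18 bytes into the struct. The sketch below is a userspace
model only. The struct name vfsmount_model, its field order (patterned on
struct vfsmount in include/linux/mount.h) and the helper idmap_load_address()
are assumptions made for illustration, and the arithmetic assumes an LP64
target with no structure layout randomization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of struct vfsmount. The field order is an
 * assumption; the real kernel layout may differ by version and config.
 */
struct vfsmount_model {
	void *mnt_root;   /* offset 0x00 on LP64 */
	void *mnt_sb;     /* offset 0x08 */
	int   mnt_flags;  /* offset 0x10, followed by 4 bytes of padding */
	void *mnt_idmap;  /* offset 0x18 */
};

/*
 * Compute the address that a load of base->mnt_idmap would touch,
 * without dereferencing anything.
 */
static uintptr_t idmap_load_address(uintptr_t base)
{
	return base + offsetof(struct vfsmount_model, mnt_idmap);
}
```

Under those assumptions, idmap_load_address(0) gives 0x18, matching the CR2
value in the trace above.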

I have sent a patch to the AppArmor mailing list at
https://lists.ubuntu.com/archives/apparmor/2025-March/013545.html
which should give improved diagnostics if this case happens again.
My understanding is that path .mnt pointers generally should not be
NULL, but I do not know what an appropriate (non-NULL) value for that
pointer would be, as I am not familiar with the efivarfs subsystem.

The problem comes down to the superblock functions not being able to
get the struct vfsmount for the superblock (because it isn't even
allocated until after they've all been called).  The assumption I was
operating under was that, provided I added O_NOATIME to prevent the
parent directory being updated, passing in a NULL mnt for the purposes
of iterating the directory dentry was safe.  What AppArmor is trying
to do is look up the idmap for the mount point to do one of its checks.

There are two ways of fixing this that I can think of.  One would be
to export a function that lets me dig the vfsmount out of s_mounts and
use that (it's well hidden in the internals of fs/mount.h, so I suspect
this might not be very acceptable).  The other would be to get
mnt_idmap to return &nop_mnt_idmap if the passed in mnt is NULL.  I'd
lean towards the latter, but I'm cc'ing fsdevel to see what others
think.
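
For readers following along, the second option might look roughly like the
sketch below. This is a userspace model, not a kernel patch: the struct
definitions are simplified stand-ins, and the accessor name
mnt_idmap_or_nop() is hypothetical (the proposal above is to change
mnt_idmap itself). Whether a NULL-tolerant accessor is acceptable at all is
exactly the question put to fsdevel.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real layouts. */
struct mnt_idmap {
	int id;
};

struct vfsmount {
	struct mnt_idmap *mnt_idmap;
};

/* Stand-in for the kernel's global no-op idmap object. */
static struct mnt_idmap nop_mnt_idmap = { 0 };

/*
 * Hypothetical NULL-tolerant accessor: fall back to the no-op idmap when
 * the caller has no vfsmount, instead of dereferencing a NULL pointer.
 */
static struct mnt_idmap *mnt_idmap_or_nop(const struct vfsmount *mnt)
{
	if (!mnt)
		return &nop_mnt_idmap;
	return mnt->mnt_idmap;
}
```

With this shape, a caller that passes a NULL mnt gets &nop_mnt_idmap back,
while callers with a real mount see its own idmap unchanged.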


Al spotted the same issue based on a syzbot report [0]

[0] https://lore.kernel.org/all/20250310235831.GL2023217@ZenIV/
