Re: possible deadlock in start_this_handle (2)
From: Dmitry Vyukov <dvyukov@google.com>
Date: 2021-02-13 10:59:52
Also in:
linux-ext4, lkml
On Fri, Feb 12, 2021 at 4:43 PM Michal Hocko [off-list ref] wrote:
On Fri 12-02-21 21:58:15, Tetsuo Handa wrote:quoted
On 2021/02/12 21:30, Michal Hocko wrote:quoted
On Fri 12-02-21 12:22:07, Matthew Wilcox wrote:quoted
On Fri, Feb 12, 2021 at 08:18:11PM +0900, Tetsuo Handa wrote:quoted
On 2021/02/12 1:41, Michal Hocko wrote:quoted
But I suspect we have drifted away from the original issue. I thought that a simple check would help us narrow down this particular case and somebody messing up from the IRQ context didn't sound like a completely off.From my experience at https://lkml.kernel.org/r/201409192053.IHJ35462.JLOMOSOFFVtQFH@I-love.SAKURA.ne.jp , I think we can replace direct PF_* manipulation with macros which do not receive "struct task_struct *" argument. Since TASK_PFA_TEST()/TASK_PFA_SET()/TASK_PFA_CLEAR() are for manipulating PFA_* flags on a remote thread, we can define similar ones for manipulating PF_* flags on current thread. Then, auditing dangerous users becomes easier.No, nobody is manipulating another task's GFP flags.Agreed. And nobody should be manipulating PF flags on remote tasks either.No. You are misunderstanding. The bug report above is an example of manipulating PF flags on remote tasks.The bug report you are referring to is ancient. And the cpuset code doesn't touch task->flags for a long time. I haven't checked exactly but it is years since regular and atomic flags have been separated unless I misremember.quoted
You say "nobody should", but the reality is "there indeed was". There might be unnoticed others. The point of this proposal is to make it possible to "find such unnoticed users who are manipulating PF flags on remote tasks".I am really confused what you are proposing here TBH and referring to an ancient bug doesn't really help. task->flags are _explicitly_ documented to be only used for _current_. Is it possible that somebody writes a buggy code? Sure, should we build a whole infrastructure around that to catch such a broken code? I am not really sure. One bug 6 years ago doesn't sound like a good reason for that.
Another similar one was just reported: https://syzkaller.appspot.com/bug?extid=1b2c6989ec12e467d65c WARNING: possible circular locking dependency detected 5.11.0-rc7-syzkaller #0 Not tainted ------------------------------------------------------ kswapd0/2232 is trying to acquire lock: ffff88801f552650 (sb_internal){.+.+}-{0:0}, at: evict+0x2ed/0x6b0 fs/inode.c:577 but task is already holding lock: ffffffff8be89240 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:5195 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (fs_reclaim){+.+.}-{0:0}: __fs_reclaim_acquire mm/page_alloc.c:4326 [inline] fs_reclaim_acquire+0x117/0x150 mm/page_alloc.c:4340 might_alloc include/linux/sched/mm.h:193 [inline] slab_pre_alloc_hook mm/slab.h:493 [inline] slab_alloc_node mm/slab.c:3221 [inline] kmem_cache_alloc_node_trace+0x48/0x520 mm/slab.c:3596 __do_kmalloc_node mm/slab.c:3618 [inline] __kmalloc_node+0x38/0x60 mm/slab.c:3626 kmalloc_node include/linux/slab.h:575 [inline] kvmalloc_node+0x61/0xf0 mm/util.c:587 kvmalloc include/linux/mm.h:781 [inline] ext4_xattr_inode_cache_find fs/ext4/xattr.c:1465 [inline] ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1508 [inline] ext4_xattr_set_entry+0x1ce6/0x3780 fs/ext4/xattr.c:1649 ext4_xattr_ibody_set+0x78/0x2b0 fs/ext4/xattr.c:2224 ext4_xattr_set_handle+0x8f4/0x13e0 fs/ext4/xattr.c:2380 ext4_xattr_set+0x13a/0x340 fs/ext4/xattr.c:2493 __vfs_setxattr+0x10e/0x170 fs/xattr.c:177 __vfs_setxattr_noperm+0x11a/0x4c0 fs/xattr.c:208 __vfs_setxattr_locked+0x1bf/0x250 fs/xattr.c:266 vfs_setxattr+0x135/0x320 fs/xattr.c:291 setxattr+0x1ff/0x290 fs/xattr.c:553 path_setxattr+0x170/0x190 fs/xattr.c:572 __do_sys_setxattr fs/xattr.c:587 [inline] __se_sys_setxattr fs/xattr.c:583 [inline] __x64_sys_setxattr+0xc0/0x160 fs/xattr.c:583 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46