Re: [PATCH 1/8] lockdep: allow to disable reclaim lockup detection

[PATCH 0/8 v3] scope GFP_NOFS api · Michal Hocko <mhocko@kernel.org> · 2017-01-06
[PATCH 1/8] lockdep: allow to disable reclaim lockup detection · Michal Hocko <mhocko@kernel.org> · 2017-01-06
Re: [PATCH 1/8] lockdep: allow to disable reclaim lockup detection · Vlastimil Babka <hidden> · 2017-01-09
[PATCH 2/8] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS · Michal Hocko <mhocko@kernel.org> · 2017-01-06
Re: [PATCH 2/8] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS · Vlastimil Babka <hidden> · 2017-01-09
Re: [PATCH 2/8] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS · Michal Hocko <mhocko@kernel.org> · 2017-01-09
Re: [PATCH 2/8] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS · Darrick J. Wong <hidden> · 2017-01-09
[PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API · Michal Hocko <mhocko@kernel.org> · 2017-01-06
Re: [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API · Vlastimil Babka <hidden> · 2017-01-09
Re: [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API · Michal Hocko <mhocko@kernel.org> · 2017-01-09
Re: [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API · Michal Hocko <mhocko@kernel.org> · 2017-01-09
Re: [PATCH 3/8] mm: introduce memalloc_nofs_{save,restore} API · Vlastimil Babka <hidden> · 2017-01-09
[PATCH 4/8] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio* · Michal Hocko <mhocko@kernel.org> · 2017-01-06
Re: [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio* · Vlastimil Babka <hidden> · 2017-01-09
Re: [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio* · Michal Hocko <mhocko@kernel.org> · 2017-01-09
Re: [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio* · Brian Foster <hidden> · 2017-01-09
Re: [PATCH 4/8] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio* · Darrick J. Wong <hidden> · 2017-01-09
[PATCH 5/8] jbd2: mark the transaction context with the scope GFP_NOFS context · Michal Hocko <mhocko@kernel.org> · 2017-01-06
[PATCH 6/8] jbd2: make the whole kjournald2 kthread NOFS safe · Michal Hocko <mhocko@kernel.org> · 2017-01-06
[PATCH 7/8] Revert "ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp" · Michal Hocko <mhocko@kernel.org> · 2017-01-06
Re: [PATCH 7/8] Revert "ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp" · Theodore Ts'o <tytso@mit.edu> · 2017-01-17
Re: [PATCH 7/8] Revert "ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp" · Michal Hocko <mhocko@kernel.org> · 2017-01-17
Re: [PATCH 7/8] Revert "ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp" · Michal Hocko <mhocko@kernel.org> · 2017-03-06
[PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-06
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Theodore Ts'o <tytso@mit.edu> · 2017-01-17
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-17
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-17
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Theodore Ts'o <tytso@mit.edu> · 2017-01-17
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-17
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Jan Kara <jack@suse.cz> · 2017-01-17
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-19
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Jan Kara <jack@suse.cz> · 2017-01-19
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-19
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-26
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Theodore Ts'o <tytso@mit.edu> · 2017-01-27
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-27
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Theodore Ts'o <tytso@mit.edu> · 2017-01-27
Re: [Cluster-devel] [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Christoph Hellwig <hch@infradead.org> · 2017-01-28
Re: [Cluster-devel] [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · David Lang <hidden> · 2017-01-28
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-30
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-02-03
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Andreas Dilger <hidden> · 2017-01-17
Re: [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction" · Michal Hocko <mhocko@kernel.org> · 2017-01-18
[DEBUG PATCH 0/2] debug explicit GFP_NO{FS,IO} usage from the scope context · Michal Hocko <mhocko@kernel.org> · 2017-01-06
[DEBUG PATCH 1/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context · Michal Hocko <mhocko@kernel.org> · 2017-01-06
[DEBUG PATCH 2/2] silent warnings which we cannot do anything about · Michal Hocko <mhocko@kernel.org> · 2017-01-06

From: Vlastimil Babka <hidden>
Date: 2017-01-09 12:56:52
Also in: ceph-devel, linux-btrfs, linux-f2fs-devel, linux-fsdevel, linux-mm, linux-nfs, linux-xfs, lkml

On 01/06/2017 03:11 PM, Michal Hocko wrote:

From: Michal Hocko <mhocko@suse.com>

The current implementation of the reclaim lockup detection can produce
false positives. These do happen in practice, and they usually lead to
the code being tweaked to silence lockdep by using GFP_NOFS even though
the context can use __GFP_FS just fine. See
http://lkml.kernel.org/r/20160512080321.GA18496@dastard as an example.

=================================
[ INFO: inconsistent lock state ]
4.5.0-rc2+ #4 Tainted: G           O
---------------------------------
inconsistent {RECLAIM_FS-ON-R} -> {IN-RECLAIM_FS-W} usage.
kswapd0/543 [HC0[0]:SC0[0]:HE1:SE1] takes:

(&xfs_nondir_ilock_class){++++-+}, at: [<ffffffffa00781f7>] xfs_ilock+0x177/0x200 [xfs]

{RECLAIM_FS-ON-R} state was registered at:
  [<ffffffff8110f369>] mark_held_locks+0x79/0xa0
  [<ffffffff81113a43>] lockdep_trace_alloc+0xb3/0x100
  [<ffffffff81224623>] kmem_cache_alloc+0x33/0x230
  [<ffffffffa008acc1>] kmem_zone_alloc+0x81/0x120 [xfs]
  [<ffffffffa005456e>] xfs_refcountbt_init_cursor+0x3e/0xa0 [xfs]
  [<ffffffffa0053455>] __xfs_refcount_find_shared+0x75/0x580 [xfs]
  [<ffffffffa00539e4>] xfs_refcount_find_shared+0x84/0xb0 [xfs]
  [<ffffffffa005dcb8>] xfs_getbmap+0x608/0x8c0 [xfs]
  [<ffffffffa007634b>] xfs_vn_fiemap+0xab/0xc0 [xfs]
  [<ffffffff81244208>] do_vfs_ioctl+0x498/0x670
  [<ffffffff81244459>] SyS_ioctl+0x79/0x90
  [<ffffffff81847cd7>] entry_SYSCALL_64_fastpath+0x12/0x6f

       CPU0
       ----
  lock(&xfs_nondir_ilock_class);
  <Interrupt>
    lock(&xfs_nondir_ilock_class);

 *** DEADLOCK ***

3 locks held by kswapd0/543:

stack backtrace:
CPU: 0 PID: 543 Comm: kswapd0 Tainted: G           O    4.5.0-rc2+ #4

Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006

 ffffffff82a34f10 ffff88003aa078d0 ffffffff813a14f9 ffff88003d8551c0
 ffff88003aa07920 ffffffff8110ec65 0000000000000000 0000000000000001
 ffff880000000001 000000000000000b 0000000000000008 ffff88003d855aa0
Call Trace:
 [<ffffffff813a14f9>] dump_stack+0x4b/0x72
 [<ffffffff8110ec65>] print_usage_bug+0x215/0x240
 [<ffffffff8110ee85>] mark_lock+0x1f5/0x660
 [<ffffffff8110e100>] ? print_shortest_lock_dependencies+0x1a0/0x1a0
 [<ffffffff811102e0>] __lock_acquire+0xa80/0x1e50
 [<ffffffff8122474e>] ? kmem_cache_alloc+0x15e/0x230
 [<ffffffffa008acc1>] ? kmem_zone_alloc+0x81/0x120 [xfs]
 [<ffffffff811122e8>] lock_acquire+0xd8/0x1e0
 [<ffffffffa00781f7>] ? xfs_ilock+0x177/0x200 [xfs]
 [<ffffffffa0083a70>] ? xfs_reflink_cancel_cow_range+0x150/0x300 [xfs]
 [<ffffffff8110aace>] down_write_nested+0x5e/0xc0
 [<ffffffffa00781f7>] ? xfs_ilock+0x177/0x200 [xfs]
 [<ffffffffa00781f7>] xfs_ilock+0x177/0x200 [xfs]
 [<ffffffffa0083a70>] xfs_reflink_cancel_cow_range+0x150/0x300 [xfs]
 [<ffffffffa0085bdc>] xfs_fs_evict_inode+0xdc/0x1e0 [xfs]
 [<ffffffff8124d7d5>] evict+0xc5/0x190
 [<ffffffff8124d8d9>] dispose_list+0x39/0x60
 [<ffffffff8124eb2b>] prune_icache_sb+0x4b/0x60
 [<ffffffff8123317f>] super_cache_scan+0x14f/0x1a0
 [<ffffffff811e0d19>] shrink_slab.part.63.constprop.79+0x1e9/0x4e0
 [<ffffffff811e50ee>] shrink_zone+0x15e/0x170
 [<ffffffff811e5ef1>] kswapd+0x4f1/0xa80
 [<ffffffff811e5a00>] ? zone_reclaim+0x230/0x230
 [<ffffffff810e6882>] kthread+0xf2/0x110
 [<ffffffff810e6790>] ? kthread_create_on_node+0x220/0x220
 [<ffffffff8184803f>] ret_from_fork+0x3f/0x70
 [<ffffffff810e6790>] ? kthread_create_on_node+0x220/0x220

To quote Dave:
"
Ignoring whether reflink should be doing anything or not, that's a
"xfs_refcountbt_init_cursor() gets called both outside and inside
transactions" lockdep false positive case. The problem here is
lockdep has seen this allocation from within a transaction, hence a
GFP_NOFS allocation, and now it's seeing it in a GFP_KERNEL context.
Also note that we have an active reference to this inode.

So, because the reclaim annotations overload the interrupt level
detections and it's seen the inode ilock been taken in reclaim
("interrupt") context, this triggers a reclaim context warning where
it thinks it is unsafe to do this allocation in GFP_KERNEL context
holding the inode ilock...
"

This sounds like a fundamental problem of the reclaim lock detection.
It is really impossible to annotate such a special use case IMHO unless
the reclaim lockup detection is reworked completely. Until then it
is much better to provide an "I know what I am doing" flag and mark
the problematic places with it. This would prevent abuse of the
GFP_NOFS flag, which has a runtime effect even on configurations which
have lockdep disabled.

Introduce a __GFP_NOLOCKDEP flag which tells the lockdep gfp tracking to
skip the current allocation request.
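
To illustrate the idea, here is a minimal userspace sketch of such an
opt-out check. It is not the code from the patch: the bit values, the
GFP_*_MODEL masks and the helper names lockdep_tracks_alloc() and
reclaim_may_enter_fs() are assumptions made up for this example. It only
shows why a dedicated flag is preferable to dropping __GFP_FS: the
allocation is hidden from the tracking, while reclaim behaviour stays
the same.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Hypothetical bit values for illustration only; the kernel's differ. */
#define GFP_DIRECT_RECLAIM 0x01u /* caller may enter direct reclaim */
#define GFP_FS             0x02u /* reclaim may call into the filesystem */
#define GFP_NOLOCKDEP      0x04u /* opt out of lockdep reclaim tracking */

/* Simplified stand-ins for GFP_KERNEL and GFP_NOFS. */
#define GFP_KERNEL_MODEL (GFP_DIRECT_RECLAIM | GFP_FS)
#define GFP_NOFS_MODEL   (GFP_DIRECT_RECLAIM)

/* The opt-out bit must not alias any bit that changes reclaim behaviour. */
static_assert((GFP_NOLOCKDEP & GFP_KERNEL_MODEL) == 0,
              "opt-out bit overlaps a behavioural bit");

/* Model: should the reclaim lock tracking record this allocation? */
static bool lockdep_tracks_alloc(gfp_t gfp_mask)
{
    /* No reclaim without waiting on it, so nothing to track. */
    if (!(gfp_mask & GFP_DIRECT_RECLAIM))
        return false;

    /* Only allocations that may recurse into the filesystem matter. */
    if (!(gfp_mask & GFP_FS))
        return false;

    /* The caller explicitly asked to be skipped. */
    if (gfp_mask & GFP_NOLOCKDEP)
        return false;

    return true;
}

/* Model: may reclaim triggered by this allocation enter the filesystem? */
static bool reclaim_may_enter_fs(gfp_t gfp_mask)
{
    return (gfp_mask & GFP_FS) != 0;
}
```

In this model both GFP_NOFS_MODEL and GFP_KERNEL_MODEL | GFP_NOLOCKDEP
are skipped by the tracking, but only the latter still lets reclaim
enter the filesystem, which is the runtime difference the paragraph
above is concerned with.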

While we are at it also make sure that the radix tree doesn't
accidentally override tags stored in the upper part of the gfp_mask.
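
A rough sketch of the concern, under stated assumptions: if a root word
keeps allocation flag bits in its low part and per-tree tags above them,
adding a flag bit shifts the tags upwards, so it is worth checking at
build time that they still fit. The MODEL_* constants and the root_*()
helpers below are hypothetical and the sizes are picked for illustration;
this is not the radix tree code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low bits hold gfp flags, tags sit right above. */
#define MODEL_GFP_BITS_SHIFT 26
#define MODEL_GFP_BITS_MASK  ((1u << MODEL_GFP_BITS_SHIFT) - 1u)
#define MODEL_MAX_TAGS       3

/* Build-time guard: flag bits plus tag bits must fit in a 32-bit word. */
static_assert(MODEL_GFP_BITS_SHIFT + MODEL_MAX_TAGS <= 32,
              "tags would not fit above the gfp bits");

/* Extract only the allocation flags, ignoring any tag bits. */
static uint32_t root_gfp_bits(uint32_t word)
{
    return word & MODEL_GFP_BITS_MASK;
}

/* Set tag number 'tag' (0 .. MODEL_MAX_TAGS - 1) in the upper part. */
static uint32_t root_tag_set(uint32_t word, unsigned int tag)
{
    return word | (1u << (tag + MODEL_GFP_BITS_SHIFT));
}

/* Test whether tag number 'tag' is set. */
static int root_tag_get(uint32_t word, unsigned int tag)
{
    return (int)((word >> (tag + MODEL_GFP_BITS_SHIFT)) & 1u);
}
```

With this layout, setting a tag never changes what root_gfp_bits()
returns, and the static assertion fails the build if another flag bit
pushes the tags past the end of the word.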

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Vlastimil Babka <redacted>

