Thread: 16 messages, 5 authors, 2018-03-05

Re: schedule under irqs_disabled in SLUB problem

From: Sam Kappen <hidden>
Date: 2018-03-05 08:47:30


On Tue, Dec 12, 2017 at 3:48 PM, Sebastian Andrzej Siewior
[off-list ref] wrote:
On 2017-12-05 22:01:19 [+0530], Sam Kappen wrote:
Hi,
Hi,
Thanks for looking at my queries. Please see my answers inline.
please don't top-post. Please use a client which adds proper indentation
while quoting the email.
1.)
I derived and tried a patch based on the analysis below.
(I referred to the following open source commit to derive this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.9.47-rt37-rebase&id=7a347757f027190c95a363a491c18156a926a370
)
We see this issue when the irq state changes from disabled to enabled.
During slab allocations for SCSI at boot, irqs are found to be disabled
because the system state is not yet "RUNNING".

So we added instrumentation code throughout the call trace and
confirmed that pi_lock()/pi_unlock() is the culprit that changes the irq state.
Basically, it happens when the lock is acquired with irqs disabled.
but by pi_lock/pi_unlock you don't mean the futex operation, do you?

based on the fact that the system is not in state "running" yet and this
trace here:
------------[ cut here ]------------
WARNING: at kernel/sched/core.c:3052 migrate_disable+0x10b/0x120()
Modules linked in:
CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 3.10.107-rt120+ #49
Hardware name: To be filled by O.E.M. To be filled by
Call Trace:
 [<ffffffff8105fcd5>] warn_slowpath_null+0x15/0x20
 [<ffffffff8109569c>] migrate_enable+0x14c/0x200
 [<ffffffff81100fb1>] get_page_from_freelist+0x9a1/0xbc0
 [<ffffffff81101f89>] __alloc_pages_nodemask+0x179/0xa50
 [<ffffffff81138ab1>] alloc_pages_current+0x101/0x1f0
 [<ffffffff8113cf95>] new_slab+0x265/0x310
 [<ffffffff816b386e>] __slab_alloc.isra.62+0x4e0/0x6ca
 [<ffffffff8113f5d0>] kmem_cache_alloc+0x170/0x190
 [<ffffffff810fbd0a>] mempool_alloc_slab+0x3a/0x70
 [<ffffffff810fc0be>] mempool_alloc+0xae/0x210
 [<ffffffff812d5ce8>] get_request+0x3a8/0x7c0
 [<ffffffff812d619a>] blk_get_request+0x9a/0x140
 [<ffffffff813ef02a>] scsi_execute+0x4a/0x170
---[ end trace 0000000000000001 ]---
I would say that this is the same issue and the patch I posted should help.
2.) With your patch, irqs will be enabled during the slab allocations.
Thanks. I have been testing your patch; I will update once I finish the
long-run test.
Okay, so a note to myself, there is nothing outstanding for me to do so
far.
We have tested it for nearly a month and the issue is not reproducible with your patch. Many thanks.
Regards,
Sam
Sebastian