Thread: 62 messages, 8 authors, 2017-01-16

Re: [PATCHSET v6] blk-mq scheduling framework

From: Jens Axboe <axboe@kernel.dk>
Date: 2017-01-16 15:47:23
Also in: lkml

On 01/16/2017 08:16 AM, Jens Axboe wrote:
On 01/16/2017 08:12 AM, Jens Axboe wrote:
On 01/16/2017 01:11 AM, Hannes Reinecke wrote:
On 01/13/2017 05:02 PM, Jens Axboe wrote:
On 01/13/2017 09:00 AM, Jens Axboe wrote:
On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
On 01/13/2017 04:34 PM, Jens Axboe wrote:
On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
[ .. ]
Ah, indeed.
There is an ominous udev rule here, trying to switch to 'deadline'.

# cat 60-ssd-scheduler.rules
# do not edit this file, it will be overwritten on update

ACTION!="add", GOTO="ssd_scheduler_end"
SUBSYSTEM!="block", GOTO="ssd_scheduler_end"

IMPORT{cmdline}="elevator"
ENV{elevator}=="*?", GOTO="ssd_scheduler_end"

KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"

LABEL="ssd_scheduler_end"

Still shouldn't crash the kernel, though ...
Of course not, and it's not a given that it does, it could just be
triggering after the device load and failing like expected. But just in
case, can you try and disable that rule and see if it still crashes with
MQ_DEADLINE set as the default?
Yes, it does.
Same stacktrace as before.
Alright, that's as expected. I've tried with your rule and making
everything modular, but it still boots fine for me. Very odd. Can you
send me your .config? And are all the SCSI disks hanging off ahci? Or
sdb specifically, is that ahci or something else?
Also, would be great if you could pull:

git://git.kernel.dk/linux-block blk-mq-sched

into current 'master' and see if it still reproduces. I expect that it
will, but just want to ensure that it's a problem in the current code
base as well.
Actually, it doesn't. Seems to have resolved itself with the latest drop.

However, now I've got a lockdep splat:

Jan 16 09:05:02 lammermuir kernel: ------------[ cut here ]------------
Jan 16 09:05:02 lammermuir kernel: WARNING: CPU: 29 PID: 5860 at kernel/locking/lockdep.c:3514 lock_release+0x2a7/0x490
Jan 16 09:05:02 lammermuir kernel: DEBUG_LOCKS_WARN_ON(depth <= 0)
Jan 16 09:05:02 lammermuir kernel: Modules linked in: raid0 mpt3sas raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache e
Jan 16 09:05:02 lammermuir kernel:  fb_sys_fops ahci uhci_hcd ttm ehci_pci libahci ehci_hcd serio_raw crc32c_intel drm libata usbcore hpsa
Jan 16 09:05:02 lammermuir kernel: CPU: 29 PID: 5860 Comm: fio Not tainted 4.10.0-rc3+ #540
Jan 16 09:05:02 lammermuir kernel: Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013
Jan 16 09:05:02 lammermuir kernel: Call Trace:
Jan 16 09:05:02 lammermuir kernel:  dump_stack+0x85/0xc9
Jan 16 09:05:02 lammermuir kernel:  __warn+0xd1/0xf0
Jan 16 09:05:02 lammermuir kernel:  ? aio_write+0x118/0x170
Jan 16 09:05:02 lammermuir kernel:  warn_slowpath_fmt+0x4f/0x60
Jan 16 09:05:02 lammermuir kernel:  lock_release+0x2a7/0x490
Jan 16 09:05:02 lammermuir kernel:  ? blkdev_write_iter+0x89/0xd0
Jan 16 09:05:02 lammermuir kernel:  aio_write+0x138/0x170
Jan 16 09:05:02 lammermuir kernel:  do_io_submit+0x4d2/0x8f0
Jan 16 09:05:02 lammermuir kernel:  ? do_io_submit+0x413/0x8f0
Jan 16 09:05:02 lammermuir kernel:  SyS_io_submit+0x10/0x20
Jan 16 09:05:02 lammermuir kernel:  entry_SYSCALL_64_fastpath+0x23/0xc6
Odd, not sure that's me. What did you pull my branch into? And what is the
sha of the stuff you pulled in?
Forgot to ask, please send me the fio job you ran here.
Never mind, it's a mainline bug that's fixed in -rc4:

commit a12f1ae61c489076a9aeb90bddca7722bf330df3
Author: Shaohua Li [off-list ref]
Date:   Tue Dec 13 12:09:56 2016 -0800

    aio: fix lock dep warning

-- 
Jens Axboe