Re: [PATCHSET v6] blk-mq scheduling framework
From: Jens Axboe <axboe@kernel.dk>
Date: 2017-01-16 15:47:23
Also in:
lkml
On 01/16/2017 08:16 AM, Jens Axboe wrote:
> On 01/16/2017 08:12 AM, Jens Axboe wrote:
>> On 01/16/2017 01:11 AM, Hannes Reinecke wrote:
>>> On 01/13/2017 05:02 PM, Jens Axboe wrote:
>>>> On 01/13/2017 09:00 AM, Jens Axboe wrote:
>>>>> On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
>>>>>> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>>>>>>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
>>>>>>>> [ .. ]
>>>>>>>> Ah, indeed.
>>>>>>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>>>>>>
>>>>>>>> # cat 60-ssd-scheduler.rules
>>>>>>>> # do not edit this file, it will be overwritten on update
>>>>>>>>
>>>>>>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>>>>>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>>>>>>
>>>>>>>> IMPORT{cmdline}="elevator"
>>>>>>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>>>>>>
>>>>>>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"
>>>>>>>>
>>>>>>>> LABEL="ssd_scheduler_end"
>>>>>>>>
>>>>>>>> Still shouldn't crash the kernel, though ...
>>>>>>>
>>>>>>> Of course not, and it's not a given that it does, it could just be
>>>>>>> triggering after the device load and failing like expected. But just
>>>>>>> in case, can you try and disable that rule and see if it still
>>>>>>> crashes with MQ_DEADLINE set as the default?
>>>>>>
>>>>>> Yes, it does.
>>>>>> Same stacktrace as before.
>>>>>
>>>>> Alright, that's as expected. I've tried with your rule and making
>>>>> everything modular, but it still boots fine for me. Very odd. Can you
>>>>> send me your .config? And are all the SCSI disks hanging off ahci? Or
>>>>> sdb specifically, is that ahci or something else?
>>>>
>>>> Also, would be great if you could pull:
>>>>
>>>> git://git.kernel.dk/linux-block blk-mq-sched
>>>>
>>>> into current 'master' and see if it still reproduces. I expect that it
>>>> will, but just want to ensure that it's a problem in the current code
>>>> base as well.
>>>
>>> Actually, it doesn't. Seems to have resolved itself with the latest drop.
>>>
>>> However, now I've got a lockdep splat:
>>>
>>> Jan 16 09:05:02 lammermuir kernel: ------------[ cut here ]------------
>>> Jan 16 09:05:02 lammermuir kernel: WARNING: CPU: 29 PID: 5860 at kernel/locking/lockdep.c:3514 lock_release+0x2a7/0x490
>>> Jan 16 09:05:02 lammermuir kernel: DEBUG_LOCKS_WARN_ON(depth <= 0)
>>> Jan 16 09:05:02 lammermuir kernel: Modules linked in: raid0 mpt3sas raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache e
>>> Jan 16 09:05:02 lammermuir kernel: fb_sys_fops ahci uhci_hcd ttm ehci_pci libahci ehci_hcd serio_raw crc32c_intel drm libata usbcore hpsa
>>> Jan 16 09:05:02 lammermuir kernel: CPU: 29 PID: 5860 Comm: fio Not tainted 4.10.0-rc3+ #540
>>> Jan 16 09:05:02 lammermuir kernel: Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013
>>> Jan 16 09:05:02 lammermuir kernel: Call Trace:
>>> Jan 16 09:05:02 lammermuir kernel: dump_stack+0x85/0xc9
>>> Jan 16 09:05:02 lammermuir kernel: __warn+0xd1/0xf0
>>> Jan 16 09:05:02 lammermuir kernel: ? aio_write+0x118/0x170
>>> Jan 16 09:05:02 lammermuir kernel: warn_slowpath_fmt+0x4f/0x60
>>> Jan 16 09:05:02 lammermuir kernel: lock_release+0x2a7/0x490
>>> Jan 16 09:05:02 lammermuir kernel: ? blkdev_write_iter+0x89/0xd0
>>> Jan 16 09:05:02 lammermuir kernel: aio_write+0x138/0x170
>>> Jan 16 09:05:02 lammermuir kernel: do_io_submit+0x4d2/0x8f0
>>> Jan 16 09:05:02 lammermuir kernel: ? do_io_submit+0x413/0x8f0
>>> Jan 16 09:05:02 lammermuir kernel: SyS_io_submit+0x10/0x20
>>> Jan 16 09:05:02 lammermuir kernel: entry_SYSCALL_64_fastpath+0x23/0xc6
>>
>> Odd, not sure that's me. What did you pull my branch into? And what is the
>> sha of the stuff you pulled in?
>
> Forgot to ask, please send me the fio job you ran here.

Nevermind, it's a mainline bug that's fixed in -rc4:

commit a12f1ae61c489076a9aeb90bddca7722bf330df3
Author: Shaohua Li [off-list ref]
Date:   Tue Dec 13 12:09:56 2016 -0800

    aio: fix lock dep warning

--
Jens Axboe