Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)
From: Christian Borntraeger <hidden>
Date: 2017-12-14 17:32:31
Also in:
linux-s390, lkml
Independent of the issues with the DASD disks, this also seems not to enable additional hardware queues. With CPUs 0,1 (and 248 CPUs max) I get CPUs 0 and 2-247 attached to hardware context 0, and CPU 1 attached to hardware context 1. If I now add a CPU, this does not change anything: hardware contexts 2, 3, 4, etc. all have no CPU, and hardware context 0 keeps sitting on all CPUs (except 1).

On 12/07/2017 10:20 AM, Christian Borntraeger wrote:
> On 12/07/2017 12:29 AM, Christoph Hellwig wrote:
>> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
>>> commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad
>>>     blk-mq: create a blk_mq_ctx for each possible CPU
>>> does not boot on DASD and
>>> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
>>>     genirq/affinity: assign vectors to all possible CPUs
>>> does boot with DASD disks.
>>>
>>> Also adding Stefan Haberland if he has an idea why this fails on DASD
>>> and adding Martin (for the s390 irq handling code).
>>
>> That is interesting as it really isn't related to interrupts at all,
>> it just ensures that possible CPUs are set in ->cpumask.
>>
>> I guess we'd really want:
>>
>> e005655c389e3d25bf3e43f71611ec12f3012de0
>> "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"
>>
>> before this commit, but it seems like the whole stack didn't work for
>> your either.
>>
>> I wonder if there is some weird thing about nr_cpu_ids in s390?
>
> The problem starts as soon as NR_CPUS is larger than the number
> of real CPUs.
>
> A question:
>
> Wouldn't your change in blk_mq_hctx_next_cpu fail if there is more than
> one non-online CPU? E.g. don't we need something like this
> (whitespace and indent damaged):
>
> @@ -1241,11 +1241,11 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
>         if (--hctx->next_cpu_batch <= 0) {
>                 int next_cpu;
>
> +               do {
>                 next_cpu = cpumask_next(hctx->next_cpu, hctx->cpumask);
> -               if (!cpu_online(next_cpu))
> -                       next_cpu = cpumask_next(next_cpu, hctx->cpumask);
>                 if (next_cpu >= nr_cpu_ids)
>                         next_cpu = cpumask_first(hctx->cpumask);
> +               } while (!cpu_online(next_cpu));
>
>                 hctx->next_cpu = next_cpu;
>                 hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>
> It does not fix the issue, though (and it would be pretty inefficient
> for large NR_CPUS).
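
For illustration only, here is a minimal userspace sketch of the "skip every offline CPU while walking the hctx mask round-robin" idea raised in the question above. It is not kernel code and not the fix that was eventually applied: the masks are plain bit fields, and NR_CPU_IDS, mask_next(), mask_first() and next_online_cpu() are hypothetical stand-ins for nr_cpu_ids, cpumask_next(), cpumask_first() and the loop inside blk_mq_hctx_next_cpu(). The sketch advances a local cursor on every iteration so that each step makes progress, and it bounds the walk so that a mask with no online CPU ends the search instead of spinning.

```c
#include <assert.h>

#define NR_CPU_IDS 8 /* hypothetical small value, stands in for nr_cpu_ids */

/* Toy stand-in for cpumask_next(): first set bit after 'cpu', or NR_CPU_IDS. */
static int mask_next(int cpu, unsigned int mask)
{
	for (int n = cpu + 1; n < NR_CPU_IDS; n++)
		if (mask & (1u << n))
			return n;
	return NR_CPU_IDS;
}

/* Toy stand-in for cpumask_first(): lowest set bit, or NR_CPU_IDS. */
static int mask_first(unsigned int mask)
{
	return mask_next(-1, mask);
}

/*
 * Return the next online CPU in hctx_mask after 'cur', wrapping around.
 * Returns -1 if hctx_mask contains no online CPU.
 */
static int next_online_cpu(int cur, unsigned int hctx_mask,
			   unsigned int online_mask)
{
	int cpu = cur;

	assert(cur >= -1 && cur < NR_CPU_IDS);

	/* At most NR_CPU_IDS steps are needed to visit every mask member. */
	for (int tries = 0; tries < NR_CPU_IDS; tries++) {
		cpu = mask_next(cpu, hctx_mask); /* advance the cursor */
		if (cpu >= NR_CPU_IDS)
			cpu = mask_first(hctx_mask); /* wrap around */
		if (cpu >= NR_CPU_IDS)
			return -1; /* empty mask */
		if (online_mask & (1u << cpu))
			return cpu;
	}
	return -1; /* every CPU in the mask is offline */
}
```

With a mask that mirrors the situation described at the top of this mail (hardware context 0 mapped to CPU 0 and CPUs 2-7, only CPUs 0 and 1 online), next_online_cpu(0, 0xFD, 0x03) walks past the six offline CPUs and wraps back to CPU 0. The linear walk also shows why this approach gets expensive when NR_CPUS is large.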