Thread (18 messages) 18 messages, 5 authors, 2021-08-03

Re: [PATCH FIXES/IMPROVEMENTS 0/7] block, bfq: preserve control, boost throughput, fix bugs

From: Piotr Górski <hidden>
Date: 2021-06-21 20:04:10
Also in: lkml

I have tested this with myself and this error does not occur with me
and I have not noticed any regressions. I have applied almost exactly
the same patches as Oleksandr.

pon., 21 cze 2021 o 21:55 Oleksandr Natalenko
[off-list ref] napisał(a):
Hello.

On sobota 19. června 2021 16:09:41 CEST Paolo Valente wrote:
quoted
Hi Jens,
this series contains an already proposed patch by Luca, plus six new
patches. The goals of these patches are summarized in the subject of
this cover letter. I'm including Luca's patch here, because it enabled
the actual use of stable merge, and, as such, triggered an otherwise
silent bug. This series contains also the fix for that bug ("block,
bfq: avoid delayed merge of async queues"), tested by Holger [1].

Thanks,
Paolo

[1] https://lkml.org/lkml/2021/5/18/384

Luca Mariotti (1):
  block, bfq: fix delayed stable merge check

Paolo Valente (5):
  block, bfq: let also stably merged queues enjoy weight raising
  block, bfq: consider also creation time in delayed stable merge
  block, bfq: avoid delayed merge of async queues
  block, bfq: check waker only for queues with no in-flight I/O
  block, bfq: reset waker pointer with shared queues

Pietro Pedroni (1):
  block, bfq: boost throughput by extending queue-merging times

 block/bfq-iosched.c | 68 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 53 insertions(+), 15 deletions(-)

--
2.20.1
Not sure everything goes fine here. After applying this series on top of the
latest stable 5.12 kernel I got this:
[16730.963248] kernel BUG at block/elevator.c:236!
[16730.963254] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[16730.963257] CPU: 11 PID: 109170 Comm: kworker/u64:5 Tainted: G        W
5.12.0-pf7 #1
[16730.963260] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS
3601 05/26/2021
[16730.963263] Workqueue: dm-thin do_worker [dm_thin_pool]
[16730.963270] RIP: 0010:elv_rqhash_find+0xcc/0xd0
[16730.963274] Code: 41 89 f0 81 e2 00 40 06 00 41 81 e0 1a 00 04 00 44 09 c2
75 a9 be 09 00 00 00 c4 e2 4b f7 50 28 48 03 50 30 48 39 fa 75 c6 c3 <0f> 0b
66 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 8b 47 68 48
[16730.963276] RSP: 0018:ffffa558d13b7af8 EFLAGS: 00010046
[16730.963279] RAX: ffff8a0007782d00 RBX: ffff8a0014b93000 RCX: ffffa558d13b7b78
[16730.963281] RDX: ffff8a0014b93000 RSI: 0000000000063082 RDI: 000000001e0fdc00
[16730.963283] RBP: ffff8a000731c770 R08: ffff8a000731c770 R09: fffffff0ffffddfb
[16730.963284] R10: 0000000000000000 R11: 0000000000000400 R12: ffff8a0330365c00
[16730.963286] R13: ffffa558d13b7b30 R14: 0000000000000000 R15: ffff8a0212fc4000
[16730.963288] FS:  0000000000000000(0000) GS:ffff8a070ecc0000(0000) knlGS:
0000000000000000
[16730.963290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16730.963292] CR2: 00007f1514d90f4c CR3: 0000000315952000 CR4:
0000000000350ee0
[16730.963294] Call Trace:
[16730.963297]  elv_merge+0x96/0x120
[16730.963300]  blk_mq_sched_try_merge+0x3e/0x370
[16730.963303]  bfq_bio_merge+0xd3/0x130
[16730.963306]  blk_mq_submit_bio+0x11e/0x6c0
[16730.963309]  submit_bio_noacct+0x457/0x530
[16730.963312]  raid10_unplug+0x13f/0x1a0 [raid10]
[16730.963316]  blk_flush_plug_list+0xa9/0x110
[16730.963319]  blk_finish_plug+0x21/0x30
[16730.963322]  process_prepared_discard_passdown_pt1+0x204/0x2d0
[dm_thin_pool]
[16730.963327]  do_worker+0x18e/0xce0 [dm_thin_pool]
[16730.963335]  process_one_work+0x217/0x3e0
[16730.963338]  worker_thread+0x4d/0x470
[16730.963343]  kthread+0x182/0x1b0
[16730.963349]  ret_from_fork+0x22/0x30

[16730.963419] ---[ end trace dd7e037f2028257b ]---
[16730.963524] RIP: 0010:elv_rqhash_find+0xcc/0xd0

[16730.963547] note: kworker/u64:5[109170] exited with preempt_count 1
[16747.948467] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for
fences timed out!
Which is:
229 struct request *elv_rqhash_find(struct request_queue *q, sector_t offset)
230 {

235     hash_for_each_possible_safe(e->hash, rq, next, hash, offset) {
236         BUG_ON(!ELV_ON_HASH(rq));

Yes, I carry some extra patches besides this series (the list is against v5.12
GA):
block, bfq: reset waker pointer with shared queues
block, bfq: check waker only for queues with no in-flight I/O
block, bfq: avoid delayed merge of async queues
block, bfq: boost throughput by extending queue-merging times
block, bfq: consider also creation time in delayed stable merge
block, bfq: fix delayed stable merge check
block, bfq: let also stably merged queues enjoy weight raising
block: Do not pull requests from the scheduler when we cannot dispatch them
blk: Fix lock inversion between ioc lock and bfqd lock
bfq: Remove merged request already in bfq_requests_merged()
block, bfq: avoid circular stable merges
bfq: remove unnecessary BFQ_DEFAULT_GRP_IOPRIO
bfq: reset entity->prio_changed in bfq_init_entity()
bfq: optimize the calculation of bfq_weight_to_ioprio()
bfq: remove unnecessary initialization logic
bfq: keep the minimun bandwidth for CLASS_BE
bfq: limit the IO depth of CLASS_IDLE to 1
bfq: convert the type of bfq_group.bfqd to bfq_data*
bfq: introduce bfq_entity_to_bfqg helper method
bfq/mq-deadline: remove redundant check for passthrough request
blk-mq: bypass IO scheduler's limit_depth for passthrough request
block,bfq: fix the timeout calculation in bfq_bfqq_charge_time
block, bfq: merge bursts of newly-created queues
block, bfq: keep shared queues out of the waker mechanism
block, bfq: fix weight-raising resume with !low_latency
block, bfq: make shared queues inherit wakers
block, bfq: put reqs of waker and woken in dispatch list
block, bfq: always inject I/O of queues blocked by wakers
but nothing from there triggered this for quite some time.

Paolo, what do you think?

--
Oleksandr Natalenko (post-factum)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help