Thread: 44 messages, 9 authors, 2022-06-02

Re: [Update PATCH V3] md: don't unregister sync_thread with reconfig_mutex held

From: Christoph Hellwig <hch@infradead.org>
Date: 2022-05-31 06:12:03
Also in: linux-block

On Thu, May 26, 2022 at 01:53:36PM +0200, Jan Kara wrote:
> So I've debugged this. The crash happens on the very first bio submitted to
> the md0 device. The problem is that this bio gets remapped to loop0 - this
> happens through bio_alloc_clone() -> __bio_clone() which ends up calling
> bio_clone_blkg_association(). Now the resulting bio is inconsistent - its
> dst_bio->bi_bdev is pointing to loop0 while dst_bio->bi_blkg is pointing to
> blkcg_gq associated with md0 request queue. And this breaks BFQ because
> when this bio is inserted to loop0 request queue, BFQ looks at
> bio->bi_blkg->q (it is a bit more complex than that but this is the gist
> of the problem), expects its data there but BFQ is not initialized for md0
> request_queue.
>
> Now I think this is a bug in __bio_clone() but the inconsistency in the bio
> is very much what we asked bio_clone_blkg_association() to do so maybe I'm
> missing something and bios that are associated with one bdev but pointing
> to blkg of another bdev are fine and controllers are supposed to handle
> that (although I'm not sure how should they do that). So I'm asking here
> before I just go and delete bio_clone_blkg_association() from
> __bio_clone()...

This behavior probably goes back to my commit here:

commit d92c370a16cbe0276954c761b874bd024a7e4fac
Author: Christoph Hellwig [off-list ref]
Date:   Sat Jun 27 09:31:48 2020 +0200

    block: really clone the block cgroup in bio_clone_blkg_association

and it seems everyone else was fine with that behavior so far.
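
For anyone following along, the inconsistency Jan describes can be modelled
in user space. This is only a rough sketch, not kernel code: the structs
below are stripped-down stand-ins that borrow the kernel's names, and
model_clone(), blkg_matches_bdev() and model_md_over_loop() are hypothetical
helpers made up for illustration. It assumes the clone copies bi_blkg
verbatim while bi_bdev is pointed at the lower device.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures are far larger. */
struct request_queue { const char *name; };
struct block_device  { struct request_queue *q; };
struct blkcg_gq      { struct request_queue *q; };

struct bio {
	struct block_device *bi_bdev;
	struct blkcg_gq     *bi_blkg;
};

/* Hypothetical model of the clone step: retarget the bio at the lower
 * device but copy the blkg association from the source unchanged. */
static void model_clone(struct bio *dst, const struct bio *src,
			struct block_device *target)
{
	dst->bi_bdev = target;
	dst->bi_blkg = src->bi_blkg;
}

/* Hypothetical check: does the blkg belong to the queue the bio targets? */
static int blkg_matches_bdev(const struct bio *bio)
{
	return bio->bi_blkg == NULL || bio->bi_blkg->q == bio->bi_bdev->q;
}

/* Builds the md0-over-loop0 case; returns nonzero if the clone is
 * consistent, 0 if bi_blkg still points at the md0 queue. */
int model_md_over_loop(void)
{
	struct request_queue md0_q = { "md0" }, loop0_q = { "loop0" };
	struct block_device md0 = { &md0_q }, loop0 = { &loop0_q };
	struct blkcg_gq md0_blkg = { &md0_q };
	struct bio orig = { &md0, &md0_blkg };
	struct bio clone = { NULL, NULL };

	/* The bio submitted to md0 is consistent to begin with. */
	assert(blkg_matches_bdev(&orig));

	model_clone(&clone, &orig, &loop0);
	return blkg_matches_bdev(&clone);
}
```

In this model model_md_over_loop() returns 0: the clone targets loop0 while
its bi_blkg->q is still the md0 queue, which is the state an I/O scheduler
attached to loop0 would end up looking at.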