Thread (7 messages) 7 messages, 3 authors, 2023-02-04

Re: block: remove submit_bio_noacct

From: Mike Snitzer <snitzer@kernel.org>
Date: 2023-02-03 16:40:26
Also in: dm-devel, linux-block
Subsystem: block layer, the rest · Maintainers: Jens Axboe, Linus Torvalds

On Fri, Feb 03 2023 at 10:00P -0500,
Christoph Hellwig [off-list ref] wrote:
On Thu, Feb 02, 2023 at 10:16:53PM -0500, Mike Snitzer wrote:
quoted
quoted
quoted
The current usage of submit_bio vs submit_bio_noacct which skips the
VM events and task account is a bit unclear.  It seems to be mostly
intended for sending on bios received by stacking drivers, but also
seems to be used for stacking drivers newly generated metadata
sometimes.
Your lack of confidence conveyed in the above shook me a little bit
considering so much of this code is attributed to you -- I mostly got
past that, but I am a bit concerned about one aspect of the
submit_bio() change (2nd to last comment inlined below).
The confidence is about how it is used.  And that's up to the driver
authors, not helped by them not having any guidelines.  And while
I've touched this code a lot, the split between the two levels of API
long predates me.
quoted
quoted
quoted
Remove the separate API and just skip the accounting if submit_bio
is called recursively.  This gets us an accounting behavior that
is very similar (but not quite identical) to the current one, while
simplifying the API and code base.
Can you elaborate on the "but not quite identical"? This patch is
pretty mechanical, just folding code and renaming.. but you obviously
saw subtle differences.  Likely worth callign those out precisely.
The explanation was supposed to be in the Lines above.  Now accounting
is skipped if in a ->submit_bio recursion.  Before that it dependent
on drivers calling either submit_bio or submit_bio_noacct, for which
there was no clear guideline and drivers have been a bit sloppy about.
OK, but afaict the code is identical after your refactoring.
Side-effect is drivers that were double accounting will now be fixed.
 
quoted
quoted
How have you tested this patch?  Seems like I should throw all the lvm
and DM tests at it.
blktests and the mdadm tests (at least as far as they got upstream, md
or it's tests always seem somewhat broken).  dmtests is something
I've never managed to get to actually run due it's insistence on
using not packaged ruby stuff.
Yeah, device-mapper-test-suite (dmts) is a PITA due to ruby dep. And
the tests have gotten a bit stale relative to recent kernels. I'm
aware of this and also about how others would like to see more DM
coverage in blktests. We'll be looking to improve DM testing but it
does always tend to get put on back-burner (but I'll be getting some
help from Ben Marzinski to assess what tests we do have, see where we
have gaps and also put effort to making DM testing part of blktests).

I'm actually now pretty interested to see which (if any) DM tests
would have caught the missing bio checks issue in your initial patch.
 
quoted
quoted
In practice this will manifest as delaying the negative checks, until
returning from active submit_bio, but they will still happen.
Actually, I don't think those checks are done at all now.
Yes, the branch needs to be later as in this version below.
Thanks.
quoted
Unless I'm missing something, this seems like it needs proper
justification and a lot of review and testing.

So why do this change?
Because I once again got a question from an auther of a driver that
is planned to be upstreamed on which one to use.  And the answer
was it's complicated, and you really should not have to think about
it, let me dig out my old patch so that driver authors don't have
to care.
That's fair, and as I tried to say in my first reply: I agree it does
clear up that confusion nicely.

Please fold this incremental patch in, with that you can add:

Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

(not sure if my Signed-off-by needed but there you have it if so).
diff --git a/block/bio.c b/block/bio.c
index ea143fd825d7..aa0586012b0d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -475,7 +475,7 @@ static struct bio *bio_alloc_percpu_cache(struct block_device *bdev,
  *
  * Note that when running under submit_bio() (i.e. any block driver),
  * bios are not submitted until after you return - see the code in
- * submit_bio() that converts recursion into iteration, to prevent
+ * submit_bio_nocheck() that converts recursion into iteration, to prevent
  * stack overflows.
  *
  * This would normally mean allocating multiple bios under submit_bio()
diff --git a/block/blk-core.c b/block/blk-core.c
index f755ac1a2931..c41f086cb183 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -687,7 +687,7 @@ void submit_bio_nocheck(struct bio *bio)
 	/*
 	 * We only want one ->submit_bio to be active at a time, else stack
 	 * usage with stacked devices could be a problem.  Use current->bio_list
-	 * to collect a list of requests submited by a ->submit_bio method while
+	 * to collect a list of requests submitted by a ->submit_bio method while
 	 * it is active, and then process them after it returned.
 	 */
 	if (current->bio_list)
@@ -720,13 +720,13 @@ void submit_bio(struct bio *bio)
 
 	might_sleep();
 
-	if (blkcg_punt_bio_submit(bio))
-		return;
-
 	/*
 	 * Do not double account bios that are remapped and resubmitted.
 	 */
 	if (!current->bio_list) {
+		if (blkcg_punt_bio_submit(bio))
+			return;
+
 		if (bio_op(bio) == REQ_OP_READ) {
 			task_io_account_read(bio->bi_iter.bi_size);
 			count_vm_events(PGPGIN, bio_sectors(bio));
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help