Re: [PATCH v2 2/2] loop: use task_work for autoclear operation

[PATCH v2 1/2] block: export task_work_add() · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-07
[PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-07
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Stancek <jstancek@redhat.com> · 2022-01-10
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-10
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-10
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-10
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-10
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-12
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-12
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-13
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-14
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-14
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-13
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-14
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-14
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-15
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Christoph Hellwig <hch@lst.de> · 2022-01-17
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-17
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-17
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> · 2022-01-18
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Christoph Hellwig <hch@lst.de> · 2022-01-18
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Christoph Hellwig <hch@lst.de> · 2022-01-20
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Jan Kara <jack@suse.cz> · 2022-01-20
Re: [PATCH v2 2/2] loop: use task_work for autoclear operation · Christoph Hellwig <hch@lst.de> · 2022-01-10

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: 2022-01-14 11:06:00
Subsystem: block layer, the rest · Maintainers: Jens Axboe, Linus Torvalds

On 2022/01/14 0:23, Jan Kara wrote:

Well, we cannot guarantee what will be state of the loop device when you
open it but I think we should guarantee that once you have loop device
open, it will not be torn down under your hands. And now that I have
realized there are those lo_state checks, I think everything is fine in
that regard. I wanted to make sure that sequence such as:

Well, we could abort __loop_clr_fd() if somebody called "open()", something
like below. But

----------------------------------------

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index b1b05c45c07c..960db2c484ab 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c

@@ -1082,7 +1082,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	return error;
 }
 
-static void __loop_clr_fd(struct loop_device *lo)
+static bool __loop_clr_fd(struct loop_device *lo, int expected_refcnt)
 {
 	struct file *filp;
 	gfp_t gfp = lo->old_gfp_mask;

@@ -1104,9 +1104,19 @@ static void __loop_clr_fd(struct loop_device *lo)
 	 * Since this function is called upon "ioctl(LOOP_CLR_FD)" xor "close()
 	 * after ioctl(LOOP_CLR_FD)", it is a sign of something going wrong if
 	 * lo->lo_state has changed while waiting for lo->lo_mutex.
+	 *
+	 * However, if somebody called "open()" after lo->lo_state became
+	 * Lo_rundown, we should abort rundown in order to avoid unexpected
+	 * I/O error.
 	 */
 	mutex_lock(&lo->lo_mutex);
 	BUG_ON(lo->lo_state != Lo_rundown);
+	if (atomic_read(&lo->lo_refcnt) != expected_refcnt) {
+		lo->lo_state = Lo_bound;
+		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
+		mutex_unlock(&lo->lo_mutex);
+		return false;
+	}
 	mutex_unlock(&lo->lo_mutex);
 
 	if (test_bit(QUEUE_FLAG_WC, &lo->lo_queue->queue_flags))

@@ -1165,6 +1175,7 @@ static void __loop_clr_fd(struct loop_device *lo)
 		lo->lo_disk->flags |= GENHD_FL_NO_PART;
 
 	fput(filp);
+	return true;
 }
 
 static void loop_rundown_completed(struct loop_device *lo)

@@ -1181,11 +1192,12 @@ static void loop_rundown_workfn(struct work_struct *work)
 					      rundown_work);
 	struct block_device *bdev = lo->lo_device;
 	struct gendisk *disk = lo->lo_disk;
+	const bool cleared = __loop_clr_fd(lo, 0);
 
-	__loop_clr_fd(lo);
 	kobject_put(&bdev->bd_device.kobj);
 	module_put(disk->fops->owner);
-	loop_rundown_completed(lo);
+	if (cleared)
+		loop_rundown_completed(lo);
 }
 
 static void loop_schedule_rundown(struct loop_device *lo)

@@ -1228,8 +1240,8 @@ static int loop_clr_fd(struct loop_device *lo)
 	lo->lo_state = Lo_rundown;
 	mutex_unlock(&lo->lo_mutex);
 
-	__loop_clr_fd(lo);
-	loop_rundown_completed(lo);
+	if (__loop_clr_fd(lo, 1))
+		loop_rundown_completed(lo);
 	return 0;
 }
 
----------------------------------------

Currently, we may hold both. With your async patch we hold only lo_mutex.
Now that I better understand the nature of the deadlock, I agree that
holding either still creates the deadlock possibility because both are
acquired on loop device open. But now that I reminded myself the lo_state
handling, I think the following should be safe in __loop_clr_fd:

	/* Just a safety check... */
	if (WARN_ON_ONCE(data_race(lo->lo_state) != Lo_rundown))
		return -ENXIO;

this is still racy, for somebody can reach lo_open() right after this check.

Anyway, since the problem that umount() immediately after close() (reported by
kernel test robot) remains, we need to make sure that __loop_clr_fd() completes
before close() returns to user mode.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help