Thread (92 messages) 92 messages, 7 authors, 2021-11-03

Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate

From: Ming Lei <hidden>
Date: 2021-10-21 00:39:54
Also in: linux-block, linux-fsdevel, linux-kselftest, lkml

On Wed, Oct 20, 2021 at 08:48:04AM -0700, Luis Chamberlain wrote:
On Wed, Oct 20, 2021 at 09:15:20AM +0800, Ming Lei wrote:
quoted
On Tue, Oct 19, 2021 at 12:36:42PM -0700, Luis Chamberlain wrote:
quoted
On Wed, Oct 20, 2021 at 12:29:53AM +0800, Ming Lei wrote:
quoted
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d0cae7a42f4d..a14ba3d350ea 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1704,12 +1704,12 @@ static void zram_reset_device(struct zram *zram)
 	set_capacity_and_notify(zram->disk, 0);
 	part_stat_set_all(zram->disk->part0, 0);
 
-	up_write(&zram->init_lock);
 	/* I/O operation under all of CPU are done so let's free */
 	zram_meta_free(zram, disksize);
 	memset(&zram->stats, 0, sizeof(zram->stats));
 	zcomp_destroy(comp);
 	reset_bdev(zram);
+	up_write(&zram->init_lock);
 }
 
 static ssize_t disksize_store(struct device *dev,
With this, it still ends up in a state where we loop and can't get out of:

zram: Can't change algorithm for initialized device
Again, you are running two zram02.sh[1] on /dev/zram0, that isn't unexpected
You mean that it is not expected? If so then yes, of course.
My meaning is clear: it is not unexpected, so it is expected.
quoted
behavior. Here the difference is just timing.
Right, but that is what helped reproduce a difficutl to re-produce customer
bug. Once you find an easy way to reproduce a reported issue you stick
with it and try to make the situation worse to ensure no more bugs are
present.
quoted
Also you did not answer my question about your test expected result when
running the following script from two terminal concurrently:

	while true; do
		PATH=$PATH:$PWD:$PWD/../../../lib/ ./zram02.sh;
	done
If you run this, you should see no failures.
OK, not see any failure when running single zram02.sh after applying my
patch V2.
Once you start a second script that one should cause odd issues on both
sides but never crash or stall the module.
crash can't be observed with my patch V2, what do you mean 'stall'
the module? Is that 'zram' can't be unloaded after the test is
terminated via multiple 'ctrl-c'?
A second series of tests is hitting CTRL-C on either randonly and
restarting testing once again randomly.
ltp/zram02.sh has cleanup handler via trap to clean everything(swapoff/umount/reset/
rmmod), ctrl-c will terminate current forground task and cause shell to run the
cleanup handler first, but further 'ctrl-c' will terminate the cleanup handler,
then the cleanup won't be done completely, such as zram disk is left as swap
device and zram can't be unloaded. The idea can be observed via the following
script:

	#!/bin/bash
	trap 'echo "enter trap"; sleep 20; echo "exit trap";' INT
	sleep 30

After the above script is run foreground, when 1st ctrl-c is pressed, 'sleep 30'
is terminated, then the trap command is run, so you can see "enter trap"
dumped. Then if you pressed 2nd ctrl-c, 'sleep 20' is terminated immediately.
So 'swapoff' from zram02.sh's trap function can be terminated in this way.

zram disk being left as swap disk can be observed with your patch too
after terminating via multiple ctrl-c which has to be done this way because
the test is dead loop.

So it is hard to cleanup everything completely after multiple 'CTRL-C' is
involved, and it should be impossible. It needs violent multiple ctrl-c to
terminate the dealoop test.

So it isn't reasonable to expect that zram can be always unloaded successfully
after the test script is terminated via multiple ctrl-c.

But zram can be unloaded after running swapoff manually, from driver
viewpoint, nothing is wrong.
Again, neither should crash the kernel or stall the module.

In the end of these tests you should be able to run the script alone
just once and not see issues.

Thanks,
Ming
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help