Re: jbd2 task hung in jbd2_journal_commit_transaction
From: "Theodore Ts'o" <tytso@mit.edu>
Date: 2021-03-17 17:06:46
On Wed, Mar 17, 2021 at 08:30:56PM +0530, Shashidhar Patil wrote:
Hi Theodore,
Thank you for the details about the journalling layer and
insight into the block device layer.
I think Good luck might have clicked. The swap file in our case is
attached to a loop block device before enabling swap using swapon.
Since loop driver processes its IO requests by calling
vfs_iter_write() the write requests re-enter the ext4
filesystem/journalling code.
Is that right? There seems to be a possibility of cyclic dependency.

If that hypothesis is correct, you should see an example of that in
one of your stack traces; do you? The loop device creates a struct
file where the file is opened using O_DIRECT. In the O_DIRECT code
path, assuming the file was fully allocated and initialized, it
shouldn't involve starting a journal handle.

That being said, why are you using a loop device for a swap device at
all? Using a swap file directly is going to be much more efficient,
and will decrease the stack depth and CPU cycles needed to do a swap
out, if nothing else. If you can reliably reproduce the problem, what
happens if you use a swap file directly and cut out the loop device as
a swap device? Does it make the problem go away?

- Ted
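
One way to check the stack traces for the suspected re-entry is sketched
below. It is a rough heuristic, not a definitive tool: it assumes the
hung-task output marks each trace with a "Call Trace:" line, and the
frame names in LOOP_FRAMES and JOURNAL_FRAMES are assumptions that may
need adjusting for a given kernel version (only vfs_iter_write() is
named in this thread). A trace is flagged when it contains both a loop
write frame and a journal-start frame.

```python
# Assumed frame names; adjust to the symbols that appear in your traces.
LOOP_FRAMES = ("loop_queue_work", "lo_write_bvec", "vfs_iter_write")
JOURNAL_FRAMES = ("jbd2__journal_start", "ext4_journal_start")


def split_traces(log_text):
    """Split a log into traces, starting a new one at each 'Call Trace:' line."""
    traces = []
    current = None
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            if current is not None:
                traces.append(current)
            current = []
        elif current is not None:
            current.append(line.strip())
    if current is not None:
        traces.append(current)
    return traces


def reentrant_traces(log_text):
    """Return the traces that contain both a loop frame and a journal-start frame."""
    flagged = []
    for trace in split_traces(log_text):
        text = "\n".join(trace)
        has_loop = any(name in text for name in LOOP_FRAMES)
        has_journal = any(name in text for name in JOURNAL_FRAMES)
        if has_loop and has_journal:
            flagged.append(trace)
    return flagged
```

If reentrant_traces() returns nothing for the collected logs, the
cyclic-dependency hypothesis is not supported by those traces, although
that alone does not rule it out.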