Re: Problem with direct IO
From: Andrew Morton <akpm@linux-foundation.org>
Date: 2021-10-18 18:43:54
Also in:
linux-ext4, linux-fsdevel
On Mon, 18 Oct 2021 09:09:06 +0800 Zhengyuan Liu [off-list ref] wrote:
> Ping. I think this problem is serious and others may also encounter it in the future.
>
> On Wed, Oct 13, 2021 at 9:46 AM Zhengyuan Liu [off-list ref] wrote:
> > Hi all,
> >
> > We are encountering the following MySQL crash while importing tables:
> >
> >   2021-09-26T11:22:17.825250Z 0 [ERROR] [MY-013622] [InnoDB] [FATAL] fsync() returned EIO, aborting.
> >   2021-09-26T11:22:17.825315Z 0 [ERROR] [MY-013183] [InnoDB] Assertion failure: ut0ut.cc:555 thread 281472996733168
> >
> > At the same time, dmesg showed the following messages:
> >
> >   [ 4328.838972] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!
> >   [ 4328.850234] File: /data/mysql/data/sysbench/sbtest53.ibd PID: 625 Comm: kworker/42:1
> >
> > At first we suspected that MySQL was interleaving direct IO and buffered IO on the file, but after some checking we found that it only does direct IO, using aio. The problem comes from the direct-IO interface (__generic_file_write_iter) itself.
> >
> >   ssize_t __generic_file_write_iter()
> >   {
> >   ...
> >           if (iocb->ki_flags & IOCB_DIRECT) {
> >                   loff_t pos, endbyte;
> >
> >                   written = generic_file_direct_write(iocb, from);
> >                   /*
> >                    * If the write stopped short of completing, fall back to
> >                    * buffered writes. Some filesystems do this for writes to
> >                    * holes, for example. For DAX files, a buffered write will
> >                    * not succeed (even if it did, DAX does not handle dirty
> >                    * page-cache pages correctly).
> >                    */
> >                   if (written < 0 || !iov_iter_count(from) || IS_DAX(inode))
> >                           goto out;
> >
> >                   status = generic_perform_write(file, from, pos = iocb->ki_pos);
> >   ...
> >   }
> >
> > From the code snippet above we can see that direct IO can fall back to buffered IO under certain conditions, so even though MySQL only did direct IO, it could end up interleaved with buffered IO when the fallback occurred. I currently have no idea why the FS (ext3) failed the direct IO, but it is strange that __generic_file_write_iter makes direct IO fall back to buffered IO; it seems to break the semantics of direct IO.
That makes sense.
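For readers following along, here is a minimal userspace sketch of the fallback decision in the quoted snippet. It is not the kernel code: the names classify_write and enum write_path, and the three parameters, are hypothetical and exist only for illustration. The conditions mirror the `written < 0 || !iov_iter_count(from) || IS_DAX(inode)` check quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of what happens after the direct write returns. */
enum write_path {
	WRITE_FAILED,             /* direct write returned an error */
	WRITE_DIRECT_ONLY,        /* no fallback: finished, or DAX file */
	WRITE_FELL_BACK_BUFFERED  /* short direct write, rest goes buffered */
};

/*
 * direct_written: return value of the direct write (negative on error)
 * bytes_left:     bytes the direct write did not consume
 * is_dax:         whether the file is a DAX file
 */
static enum write_path classify_write(ssize_t direct_written,
				      size_t bytes_left, bool is_dax)
{
	if (direct_written < 0)
		return WRITE_FAILED;
	if (bytes_left == 0 || is_dax)
		return WRITE_DIRECT_ONLY;
	/* Short direct write on a non-DAX file: the remainder is buffered. */
	return WRITE_FELL_BACK_BUFFERED;
}
```

Under this model, a caller that only ever issues direct writes still ends up with page-cache (buffered) data whenever a direct write stops short on a non-DAX file, which is the interleaving the report describes.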
> > The reproduced environment is:
> >
> >   Platform: Kunpeng 920 (arm64)
> >   Kernel: V5.15-rc
> >   PAGESIZE: 64K
> >   Mysql: V8.0
> >   Innodb_page_size: default(16K)
This is all fairly mature code, I think. Do you know if earlier kernels were OK, and if so which versions? Thanks.