Re: [PATCH v2] mmp: do not use O_DIRECT when working with regular file
From: Alexey Lyashkov <hidden>
Date: 2021-02-20 13:22:19
Theodore, this is important for several reasons. Metadata for large devices (>400T without bigalloc enabled) is very large. With buffered IO this leads to very large memory consumption (12G+ for the metadata itself in the page cache, plus 12G+ in user memory). I don’t think half of that is useful.
On 19 Feb 2021, at 19:18, Theodore Ts'o [off-list ref] wrote: Alexey, It'd be helpful to me to understand _why_ this use case is important for your workloads. O_DIRECT support is rarely used as far as I know, and fs blocksize != page size is rare as well. The main use cases I know of fs blocksize != page size is on architectures (not terribly common) with 16k or 64k page sizes, that want to use 4k file system blocksizes for interoperability reasons.
As I pointed out earlier, e2fsprogs _FORCES_ a 1k block size in some places.
For example:
blk64_t ext2fs_first_backup_sb(blk64_t *superblock, unsigned int *block_size,
..
for (try_blocksize = EXT2_MIN_BLOCK_SIZE;
try_blocksize <= EXT2_MAX_BLOCK_SIZE ; try_blocksize *= 2) {
..
errcode_t ext2fs_open2(const char *name, const char *io_options,
io_channel_set_blksize(fs->io, SUPERBLOCK_OFFSET);
Both cases generate unaligned access (from the block device's point of view),
without any idea what the real block size is.
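To make the alignment problem concrete, here is a minimal sketch in C. It is not e2fsprogs code: the helper names direct_io_aligned() and align_span() are hypothetical, and the constant 1024 is my assumption for both SUPERBLOCK_OFFSET and EXT2_MIN_BLOCK_SIZE. It also assumes the usual Linux O_DIRECT rule that the file offset and transfer length must be multiples of the device's logical block size (the user buffer address has a similar constraint, which is not modelled here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; treat them as illustrative, not as copied from e2fsprogs. */
#define SKETCH_SUPERBLOCK_OFFSET 1024u
#define SKETCH_MIN_BLOCK_SIZE    1024u

/* Return true if a transfer of `len` bytes at byte offset `off` satisfies
 * the assumed O_DIRECT rule for a device whose logical block size is `lbs`
 * (must be a power of two). */
static bool direct_io_aligned(uint64_t off, uint64_t len, uint32_t lbs)
{
    assert(lbs != 0 && (lbs & (lbs - 1)) == 0);
    uint64_t mask = (uint64_t)lbs - 1;
    return len != 0 && ((off | len) & mask) == 0;
}

/* Widen [off, off + len) to the smallest enclosing span that is aligned
 * to `lbs`. A caller would read this span into an aligned bounce buffer
 * and copy the requested bytes out of it. */
static void align_span(uint64_t off, uint64_t len, uint32_t lbs,
                       uint64_t *aligned_off, uint64_t *aligned_len)
{
    assert(lbs != 0 && (lbs & (lbs - 1)) == 0);
    uint64_t mask = (uint64_t)lbs - 1;
    uint64_t start = off & ~mask;
    uint64_t end = (off + len + mask) & ~mask;
    *aligned_off = start;
    *aligned_len = end - start;
}
```

With these assumptions, a 1k read at offset 1024 passes the check on a device with 512-byte logical blocks but fails on one with 4096-byte logical blocks, where the enclosing aligned span is offset 0, length 4096.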
(And I suppose because mke2fs uses a 4k block size by default. Perhaps we should change this so that the default is that mke2fs will use a block size == page size, unless for some reason the page size is not one supported by ext4 (although I'm not aware of any architecture wanting page sizes > 64k), or the user explicitly specifies the block size using "mke2fs -b".)
Nice. AARCH64 / RHEL8 uses 64k pages, so what about interoperability? Should AARCH64 be able to read devices that were created on x86_64 with a 4k page size?
Are you trying to make O_DIRECT support in e2fsprogs a first class reason out of completeness concern? Or is this a use case which is important in production workloads that you are familiar with?
The primary goal is debugfs -D / e2image; both are used in production on large storage systems. I am also looking at e2fsck because of its large memory consumption. If you think O_DIRECT doesn't need to be supported, let's drop this code instead of leaving it completely broken as it is now. Thanks, Alex.
Thanks, - Ted