Thread: 19 messages, 9 authors, 2021-01-19

Re: fallocate(FALLOC_FL_ZERO_RANGE_BUT_REALLY) to avoid unwritten extents?

From: Andreas Dilger <hidden>
Date: 2021-01-12 18:41:00
Also in: linux-ext4, linux-fsdevel, linux-xfs

On Jan 12, 2021, at 11:16 AM, Christoph Hellwig [off-list ref] wrote:
On Mon, Jan 04, 2021 at 09:57:48PM +0200, Avi Kivity wrote:
I don't have a strong opinion on it. A complex userland application can
do a bit better job managing queue depth etc, but otherwise I suspect
doing the IO from kernel will win by a small bit. And the queue-depth
issue presumably would be relevant for write-zeroes as well, making me
lean towards just using the fallback.
The new flag will avoid requiring DMA to transfer the entire file size, and
perhaps can be implemented in the device by just adjusting metadata. So
there is potential for the new flag to be much more efficient.
We already support a WRITE_ZEROES operation, which many (but not all)
NVMe devices and some SCSI devices support.  The blkdev_issue_zeroout
helper can use those, or falls back to writing actual zeroes.

XFS already has an XFS_IOC_ALLOCSP64 that is defined to actually
allocate written extents.  It does not currently use
blkdev_issue_zeroout, but could be changed pretty trivially to do so.
But note it will need to be plumbed down to md and dm to be generally
useful.
DM and MD already support REQ_OP_WRITE_ZEROES, at least for the
usual targets.
Similarly, ext4 also has EXT4_GET_BLOCKS_CREATE_ZERO that can allocate
zero-filled extents rather than unwritten extents (without clobbering existing
data like FALLOC_FL_ZERO_RANGE does), and just needs a flag from fallocate()
to trigger it.  This is plumbed down to blkdev_issue_zeroout() as well.

Cheers, Andreas
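
For readers following along from userspace, here is a minimal sketch of the two behaviours being contrasted in this thread. It assumes Linux with glibc. The helper names write_zeroes() and zero_range() and the want_written parameter are hypothetical and appear nowhere in the messages above. With want_written == 0 it asks the filesystem for FALLOC_FL_ZERO_RANGE, which typically leaves unwritten extents. With want_written != 0 it writes real zeroes through pwrite(), which is roughly what an application has to do today in the absence of a dedicated fallocate() flag. Note that both paths overwrite any existing data in the range.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): write real zeroes over
 * [offset, offset + len) so the range ends up backed by written extents. */
int write_zeroes(int fd, off_t offset, off_t len)
{
    enum { CHUNK = 1 << 20 };               /* 1 MiB of zeroes per pwrite() */
    char *buf = calloc(1, CHUNK);
    if (buf == NULL)
        return -1;

    while (len > 0) {
        size_t want = len < CHUNK ? (size_t)len : (size_t)CHUNK;
        ssize_t done = pwrite(fd, buf, want, offset);
        if (done < 0 && errno == EINTR)
            continue;                       /* interrupted: retry same chunk */
        if (done <= 0) {                    /* hard error or no progress */
            free(buf);
            return -1;
        }
        offset += done;                     /* short writes just advance */
        len -= done;
    }
    free(buf);
    return 0;
}

/* Hypothetical wrapper: zero a range, either cheaply (filesystem may use
 * unwritten extents) or by forcing zeroes to be written out. */
int zero_range(int fd, off_t offset, off_t len, int want_written)
{
    assert(offset >= 0 && len > 0);

    if (!want_written) {
        if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len) == 0)
            return 0;
        if (errno != EOPNOTSUPP && errno != ENOSYS)
            return -1;
        /* Filesystem cannot do ZERO_RANGE: fall through to plain writes. */
    }
    return write_zeroes(fd, offset, len);
}
```

A caller that cares about later write latency would use zero_range(fd, off, len, 1) and pay the I/O cost up front; passing 0 is faster now but defers the extent conversion to the first real write.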



