Thread: 18 messages, 3 authors, 1d ago

Re: [PATCH v6 3/3] xfs: add support for FALLOC_FL_WRITE_ZEROES

From: Pankaj Raghav (Samsung) <hidden>
Date: 2026-06-17 09:45:38
Also in: linux-fsdevel, linux-xfs

On Tue, Jun 16, 2026 at 06:31:40AM -0700, Christoph Hellwig wrote:
> [API questions for Zhang and -fsdevel/-api below]
>
> > +	unsigned int		blksize = i_blocksize(inode);
> > +	loff_t			offset_aligned = round_down(offset, blksize);
>
> I think this actually needs to round up instead of rounding down.
>
> > +	/*
> > +	 * Zero the tail of the old EOF block and any space up to the new
> > +	 * offset.
> > +	 * In the usual truncate path, xfs_falloc_setsize takes care of
> > +	 * zeroing those blocks.
> > +	 */
> > +	if (offset_aligned > old_size) {
> > +		trace_xfs_zero_eof(ip, old_size, offset_aligned - old_size);
> > +		error = xfs_zero_range(ip, old_size, offset_aligned - old_size,
> > +				NULL, &did_zero);
> > +		if (error)
> > +			return error;
> > +	}
>
> ... then this will properly zero from the old i_size to the first block
> boundary after the old size.

Hmm, right now we do this:

|----------|----------|----------|
    ^      ^     ^    ^
    |      |     |    |
 old_size  |   offset |
           |          |
        off_rd       off_ru

At the moment, we zero out old_size to off_rd and pass offset to
xfs_alloc_file_space. xfs_alloc_file_space rounds down the offset to off_rd.

What you are proposing is to zero out old_size to off_ru, and pass
off_ru to xfs_alloc_file_space. I don't exactly understand the
difference.
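
The arithmetic behind the two variants can be sketched in userspace C. This is only an illustration of the byte ranges involved, not kernel code: `blk_round_down`, `blk_round_up` and `eof_zero_len` are hypothetical stand-ins (the first two mimic the kernel's round_down()/round_up() for power-of-two block sizes), and the sample values are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for round_down(); power-of-two blksize only. */
static inline int64_t blk_round_down(int64_t x, int64_t blksize)
{
	assert(blksize > 0 && (blksize & (blksize - 1)) == 0);
	return x & ~(blksize - 1);
}

/* Hypothetical userspace stand-in for round_up(); power-of-two blksize only. */
static inline int64_t blk_round_up(int64_t x, int64_t blksize)
{
	assert(blksize > 0 && (blksize & (blksize - 1)) == 0);
	return (x + blksize - 1) & ~(blksize - 1);
}

/*
 * Number of bytes zeroed starting at old_size when zeroing up to
 * "boundary". Mirrors the "if (offset_aligned > old_size)" check in the
 * quoted patch: nothing is zeroed when the boundary is at or before old_size.
 */
static inline int64_t eof_zero_len(int64_t old_size, int64_t boundary)
{
	return boundary > old_size ? boundary - old_size : 0;
}
```

With a 4096-byte block size, old_size = 1500 and offset = 6000 (the layout in the diagram), rounding down zeroes 2596 bytes up to off_rd = 4096, while rounding up zeroes 6692 bytes up to off_ru = 8192. When old_size and offset fall in the same block (say old_size = 5000, offset = 6000), rounding down yields off_rd = 4096, which is below old_size, so this step zeroes nothing; rounding up zeroes 3192 bytes up to 8192. Whether that is the case Christoph has in mind is not stated in the thread.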

> > +	error = xfs_alloc_file_space(ip, offset, len,
> > +			XFS_ALLOC_FILE_SPACE_WRITE_ZEROES);
>
> ... and here we need to pass offset_aligned instead of offset and
> a newly calculated len based on the last block boundary, and then
> zero again after that.  That is assuming FALLOC_FL_WRITE_ZEROES
> allows unaligned ranges for file systems.  The block code doesn't,
> but I can't quite tell from the ext4 code whether it does, and there
> is no mention of FALLOC_FL_WRITE_ZEROES even in the latest man-pages
> tree.
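
One way to read the suggestion above is as a three-way split of the requested range: an unaligned head, a block-aligned middle, and an unaligned tail. The following userspace C sketch only shows that arithmetic under the assumption of a power-of-two block size; `struct byte_range`, `struct range_split` and `split_unaligned` are hypothetical names, and which kernel helper would handle which piece is left to the discussion in the thread.

```c
#include <assert.h>
#include <stdint.h>

struct byte_range {
	int64_t start;
	int64_t len;
};

struct range_split {
	struct byte_range head;   /* offset up to the first block boundary */
	struct byte_range middle; /* whole blocks only */
	struct byte_range tail;   /* last block boundary up to offset + len */
};

/* Split [offset, offset + len) at block boundaries; power-of-two blksize only. */
static inline struct range_split split_unaligned(int64_t offset, int64_t len,
						 int64_t blksize)
{
	assert(blksize > 0 && (blksize & (blksize - 1)) == 0);
	assert(offset >= 0 && len > 0);

	int64_t end = offset + len;
	/* First block boundary at or after offset. */
	int64_t first = (offset + blksize - 1) & ~(blksize - 1);
	/* Last block boundary at or before end. */
	int64_t last = end & ~(blksize - 1);
	struct range_split s;

	if (first > last) {
		/* The range sits inside one block and touches no boundary. */
		s.head = (struct byte_range){ offset, len };
		s.middle = (struct byte_range){ end, 0 };
		s.tail = (struct byte_range){ end, 0 };
		return s;
	}

	s.head = (struct byte_range){ offset, first - offset };
	s.middle = (struct byte_range){ first, last - first };
	s.tail = (struct byte_range){ last, end - last };
	return s;
}
```

For offset = 6000, len = 10000 and a 4096-byte block size, this gives a head of 2192 bytes at 6000, a middle of 4096 bytes at 8192, and a tail of 3712 bytes at 12288. A fully aligned request has an empty head and tail.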

I can't find any references to FALLOC_FL_WRITE_ZEROES in the man pages
master branch. Maybe we missed it. I can send a separate patch for that
once we have some clarity on the API.

> Maybe we also want xfstests that try unaligned FALLOC_FL_WRITE_ZEROES
> and make sure no existing data before the range is lost and the
> entire range is zeroed?

I added FALLOC_FL_WRITE_ZEROES support to ltp (both fsx and fsstress).
For example, generic/363 tests for unaligned writes and checks for any
stale data. By default, I think we do unaligned reads, writes and
truncate in fsx.
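
The property such a test checks can be sketched in userspace C. This is not how fsx or generic/363 implement it; `check_write_zeroes` and `try_write_zeroes` are hypothetical helpers, and the fallocate() call is only compiled in when the installed headers define FALLOC_FL_WRITE_ZEROES. The buffers `before` and `after` are assumed to hold the file contents read before and after the call, covering at least offset + len bytes.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Returns 0 if the data before the range is unchanged and the whole range
 * reads back as zeroes, -1 if bytes in [0, offset) changed, -2 if any byte
 * in [offset, offset + len) is non-zero.
 */
static inline int check_write_zeroes(const unsigned char *before,
				     const unsigned char *after,
				     size_t offset, size_t len)
{
	if (memcmp(before, after, offset) != 0)
		return -1;
	for (size_t i = 0; i < len; i++) {
		if (after[offset + i] != 0)
			return -2;
	}
	return 0;
}

/*
 * Issue the fallocate() call if the headers know the flag; otherwise report
 * "not supported". Returns 0 on success or a negative errno value.
 */
static inline int try_write_zeroes(int fd, off_t offset, off_t len)
{
#ifdef FALLOC_FL_WRITE_ZEROES
	if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, offset, len) < 0)
		return -errno;
	return 0;
#else
	(void)fd;
	(void)offset;
	(void)len;
	return -EOPNOTSUPP;
#endif
}
```

A caller would read the file into `before`, call `try_write_zeroes()` with an unaligned offset and length, skip the check on a negative return such as -EOPNOTSUPP, and otherwise read the file into `after` and expect `check_write_zeroes()` to return 0.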

> > +	if (error)
> > +		return error;
> > +
> > +	/*
> > +	 * xfs_falloc_setsize() would re-zero the written extents via
> > +	 * iomap_zero_range(). Use xfs_setfilesize() instead.
> > +	 * Update in-core i_size first as xfs_setfilesize() clamps the on-disk
> > +	 * size to it.
> > +	 */
> > +	if (new_size > i_size_read(inode))
> > +		i_size_write(inode, new_size);
>
> I think Sashiko is right that we need pagecache_isize_extended and
> filemap_write_and_wait_range calls here.

Ok. The current fsx and fsstress runs did not expose this problem.
I will look into this. Thanks, Christoph.

--
Pankaj