Re: [PATCH 4/5] ext4: fallocate support in ext4
From: Andrew Morton <akpm@linux-foundation.org>
Date: 2007-05-07 23:31:54
Also in:
linux-fsdevel, linux-xfs, lkml
On Mon, 7 May 2007 19:14:42 -0400 Theodore Tso [off-list ref] wrote:
On Mon, May 07, 2007 at 03:38:56PM -0700, Andrew Morton wrote:
Actually, this is a non-issue. The reason that it is handled for extent-only is that this is the only way to allocate space in the filesystem without doing the explicit zeroing. For other filesystems (including ext3 and ext4 with block-mapped files) the filesystem should return an error (e.g. -EOPNOTSUPP) and glibc will do manual zero-filling of the file in userspace.
It can be a bit suboptimal from the layout POV. The reservations code will largely save us here, but kernel support might make it a bit better.
Actually, the reservations code won't matter, since glibc will fall back to its current behavior, which is it will do the preallocation by explicitly writing zeros to the file.
No! The reservations code is *critical* here. Without reservations, we get disastrously bad layout if two processes run a large fallocate() at the same time. (This is an SMP-only problem, btw: on UP the timeslice lengths save us.) My point is that even though reservations save us, we could do even better in-kernel. But then, a smart application would bypass the glibc fallocate() implementation, tune the reservation window size, and use direct-IO or sync_file_range()+fadvise(FADV_DONTNEED).
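For illustration only, the sync_file_range()+fadvise(FADV_DONTNEED) variant of such a do-it-yourself preallocation might look roughly like the sketch below. The function name zero_fill_uncached() and the 1 MiB chunk size are assumptions made for the example, not anything proposed in this thread; the userspace spelling of the fadvise call is posix_fadvise() with POSIX_FADV_DONTNEED. Tuning the reservation window and the direct-IO alternative are not shown, and short writes are simply retried from the new offset.

```c
#define _GNU_SOURCE             /* for sync_file_range() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK (1 << 20)         /* 1 MiB per write; an arbitrary choice */

/*
 * Zero-fill bytes [0, len) of fd without letting the zeros pile up in
 * the page cache. Returns 0 on success, -1 on error.
 */
int zero_fill_uncached(int fd, off_t len)
{
    static char zeros[CHUNK];   /* static storage is zero-initialized */
    off_t done = 0;

    assert(len >= 0);

    while (done < len) {
        size_t want = (len - done < CHUNK) ? (size_t)(len - done) : CHUNK;
        ssize_t n = pwrite(fd, zeros, want, done);

        if (n <= 0)
            return -1;

        /* Start writeback for this chunk and wait for it to finish. */
        if (sync_file_range(fd, done, n,
                            SYNC_FILE_RANGE_WAIT_BEFORE |
                            SYNC_FILE_RANGE_WRITE |
                            SYNC_FILE_RANGE_WAIT_AFTER) != 0)
            return -1;

        /* The pages are clean now, so ask the kernel to drop them. */
        (void)posix_fadvise(fd, done, n, POSIX_FADV_DONTNEED);

        done += n;
    }
    return 0;
}
```

A PVR-style caller would open the recording file and call zero_fill_uncached(fd, expected_size) before capture starts. The blocks still have to be written once, so this is no faster than the glibc fallback; it only avoids flushing everything else out of the page cache while it runs.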
This will result in the same layout as if we had done the persistent preallocation, but of course it will mean the posix_fallocate() could potentially take a long time if you're a PVR and you're reserving a gig or two for a two hour movie at high quality. That seems suboptimal, granted, and ideally the application should be warned about this before it calls posix_fallocate(). On the other hand, it's what happens today, all the time, so applications won't be too badly surprised.
A PVR implementor would take all this over and would do it themselves, for sure.
If we think application programmers badly need to know in advance if posix_fallocate() will be fast or slow, probably the right thing is to define a new fpathconf() configuration option so they can query to see whether a particular file will support a fast posix_fallocate(). I'm not 100% convinced such complexity is really needed, but I'm willing to be convinced.... what do folks think?
An application could do sys_fallocate(one-byte) to work out whether it's supported in-kernel, I guess.
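A minimal sketch of such a probe, assuming the fallocate() wrapper and FALLOC_FL_KEEP_SIZE flag as they exist in current glibc and Linux (both postdate this thread, where the interface was still being settled). The function name probe_fast_fallocate() is hypothetical. Note that on success the probe really does allocate space for the first byte of the file.

```c
#define _GNU_SOURCE             /* for fallocate() and FALLOC_FL_KEEP_SIZE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Ask the kernel to preallocate a single byte at offset 0.
 * Returns  1 if in-kernel preallocation works for this file,
 *          0 if the filesystem or kernel does not support it,
 *         -1 on any other error (errno is left set by fallocate()).
 */
int probe_fast_fallocate(int fd)
{
    /* KEEP_SIZE: allocate blocks but leave the file size unchanged. */
    int rc = fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 1);

    if (rc == 0)
        return 1;
    assert(rc == -1);           /* fallocate() returns only 0 or -1 */

    if (errno == EOPNOTSUPP || errno == ENOSYS)
        return 0;
    return -1;
}
```

An application could call this once per file (or per filesystem) and take the slow zero-filling path itself only when the result is 0.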