Thread (283 messages) 283 messages, 37 authors, 2007-07-12

Re: [PATCH 4/7][TAKE5] support new modes in fallocate

From: Amit K. Arora <hidden>
Date: 2007-07-02 11:47:27
Also in: linux-fsdevel, linux-xfs, lkml

On Mon, Jul 02, 2007 at 08:55:43AM +1000, David Chinner wrote:
On Sat, Jun 30, 2007 at 11:21:11AM +0100, Christoph Hellwig wrote:
quoted
On Tue, Jun 26, 2007 at 04:02:47PM +0530, Amit K. Arora wrote:
quoted
quoted
Can you clarify - what is the current behaviour when ENOSPC (or some other
error) is hit?  Does it keep the current fallocate() or does it free it?
Currently it is left on the file system implementation. In ext4, we do
not undo preallocation if some error (say, ENOSPC) is hit. Hence it may
end up with partial (pre)allocation. This is inline with dd and
posix_fallocate, which also do not free the partially allocated space.
I can't find anything in the specification of posix_fallocate
(http://www.opengroup.org/onlinepubs/009695399/functions/posix_fallocate.html)
that tells what should happen to allocate blocks on error.
Yeah, and AFAICT glibc leaves them behind ATM.
Yes, it does.
 
quoted
But common sense would be to not leak disk space on failure of this
syscall, and this definitively should not be left up to the filesystem,
either we always leak it or always free it, and I'd strongly favour
the latter variant.
I would not call it a "leak", since the blocks which got allocated as
part of the partial success of the fallocate syscall can be strictly
accounted for (i.e. they are assigned to a particular inode). And these
can be freed by the application, using a suitable @mode of fallocate.
 
We can't simply walk the range an remove unwritten extents, as some
of them may have been present before the fallocate() call. That
makes it extremely difficult to undo a failed call and not remove
more pre-existing pre-allocations.
Same is true for ext4 too. It is very difficult to keep track of which
uninitialized (unwritten) extents got allocated as part of the current
syscall. This is because, as David mentions, some of them might be
already present; and also because some of the older ones may have got
merged with the *new* uninitialized/unwritten extents as part of the
current syscall. 
 
Given the current behaviour for posix_fallocate() in glibc, I think
that retaining the same error semantic and punting the cleanup to
userspace (where the app will fail with ENOSPC anyway) is the only
sane thing we can do here. Trying to undo this in the kernel leads
to lots of extra rarely used code in error handling paths...
Right. This gives applications the free hand if they really want to use
the partially preallocated space, OR they want to free it; without
introducing additional complexity in the kernel.

--
Regards,
Amit Arora
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help