Thread (49 messages), 7 authors, 2021-09-28

Re: [PATCH 3/5] vfs: add a zero-initialization mode to fallocate

From: Dave Chinner <david@fromorbit.com>
Date: 2021-09-22 05:49:35
Also in: linux-xfs

On Tue, Sep 21, 2021 at 09:13:54PM -0700, Darrick J. Wong wrote:
On Wed, Sep 22, 2021 at 01:59:07PM +1000, Dave Chinner wrote:
quoted
On Tue, Sep 21, 2021 at 07:38:01PM -0700, Darrick J. Wong wrote:
quoted
On Tue, Sep 21, 2021 at 07:16:26PM -0700, Dan Williams wrote:
quoted
On Tue, Sep 21, 2021 at 1:32 AM Christoph Hellwig [off-list ref] wrote:
quoted
On Tue, Sep 21, 2021 at 10:44:31AM +1000, Dave Chinner wrote:
quoted
I think this wants to be a behavioural modifier for existing
operations rather than an operation unto itself. i.e. similar to how
KEEP_SIZE modifies ALLOC behaviour but doesn't fundamentally alter
the guarantees ALLOC provides userspace.

In this case, the change of behaviour over ZERO_RANGE is that we
want physical zeros to be written instead of the filesystem
optimising away the physical zeros by manipulating the layout
of the file.
Yes.
quoted
Then we have an API that looks like:

      ALLOC           - allocate space efficiently
      ALLOC | INIT    - allocate space by writing zeros to it
      ZERO            - zero data and preallocate space efficiently
      ZERO | INIT     - zero range by writing zeros to it

Which seems to cater for all the cases I know of where physically
writing zeros instead of allocating unwritten extents is the
preferred behaviour of fallocate()....
Agreed.  I'm not sure INIT is really the right name, but I can't come
up with a better idea offhand.
FUA? As in, this is a forced-unit-access zeroing all the way to media
bypassing any mechanisms to emulate zero-filled payloads on future
reads.
Yes, that's the semantic we want, but FUA already defines specific
data integrity behaviour in the storage stack w.r.t. volatile
caches.

Also, FUA is associated with devices - it's low level storage jargon
and so is not really appropriate to call a user interface operation
FUA where users have no idea what a "unit" or "access" actually
means.

Hence we should not overload this name with some other operation
that does not have (and should not have) explicit data integrity
requirements. That will just cause confusion for everyone.
quoted
FALLOC_FL_ZERO_EXISTING, because you want to zero the storage that
already exists at that file range?
IMO that doesn't work as a behavioural modifier for ALLOC because
the ALLOC semantics are explicitly "don't touch existing user
data"...
Well since you can't preallocate /and/ zerorange at the same time...

/* For FALLOC_FL_ZERO_RANGE, write zeroes to pre-existing mapped storage. */
#define FALLOC_FL_ZERO_EXISTING		(0x80)
Except we also want the newly allocated regions (i.e. where holes
were) in that range being zeroed to have zeroes written to them as
well, yes? Otherwise we end up with a combination of unwritten
extents and physical zeroes, and you can't use
ZERORANGE|EXISTING as a replacement for PUNCH + ALLOC|INIT.

/*
 * For preallocation and zeroing operations, force the filesystem to
 * write zeroes rather than use unwritten extents to indicate the
 * range contains zeroes.
 *
 * For filesystems that support unwritten extents, this trades off
 * slow fallocate performance for faster first write performance as
 * unwritten extent conversion on the first write to each block in
 * the range is not needed.
 *
 * Care is required when using FALLOC_FL_ALLOC_INIT_DATA as it will
 * be much slower overall for large ranges and/or slow storage
 * compared to using unwritten extents.
 */
#define FALLOC_FL_ALLOC_INIT_DATA	(1 << 7)
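
For illustration only, here is a rough userspace sketch of how an application
might use such a modifier together with FALLOC_FL_ZERO_RANGE. The flag is only
a proposal in this thread, so the macro name PROPOSED_FALLOC_FL_ALLOC_INIT_DATA
and the helpers write_zeroes() and zero_range_init() are hypothetical, and the
bit value may collide with flags defined by other kernel versions. The sketch
assumes that a kernel which does not understand the combination rejects it,
and in that case falls back to writing zeroes through the normal write path.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: the value proposed in this thread, not a stable kernel API. */
#define PROPOSED_FALLOC_FL_ALLOC_INIT_DATA	(1 << 7)

/* Fallback: physically write zeroes to [off, off + len) with pwrite(). */
static int write_zeroes(int fd, off_t off, off_t len)
{
	static const char zeros[65536];

	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(zeros) ?
			       (size_t)len : sizeof(zeros);
		ssize_t n = pwrite(fd, zeros, chunk, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {
			/* No progress on a non-empty write; give up. */
			errno = EIO;
			return -1;
		}
		off += n;
		len -= n;
	}
	return 0;
}

/*
 * Zero [off, off + len), asking for written zeroes rather than unwritten
 * extents. If the request is refused as unsupported, write the zeroes
 * by hand instead. Other errors (e.g. ENOSPC) are returned to the caller.
 */
static int zero_range_init(int fd, off_t off, off_t len)
{
	int mode = FALLOC_FL_ZERO_RANGE | PROPOSED_FALLOC_FL_ALLOC_INIT_DATA;

	if (fallocate(fd, mode, off, len) == 0)
		return 0;
	if (errno != EOPNOTSUPP && errno != EINVAL && errno != ENOSYS)
		return -1;
	return write_zeroes(fd, off, len);
}
```

Either way the caller ends up with zeroes in the range; the difference the
flag is meant to make is whether the first real write to each block still
has to pay for unwritten extent conversion.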

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com