Thread: 4 messages, 4 authors, 2021-05-04

Re: mkfs is broken due to platform_zero_range

From: Dave Chinner <david@fromorbit.com>
Date: 2021-05-04 00:51:13

On Mon, May 03, 2021 at 05:20:53PM -0700, Darrick J. Wong wrote:
So... I have a machine with an nvme drive manufactured by a certain
manufacturer who isn't known for the quality of their firmware
implementation.  I'm pretty sure that this is a result of the use of
fallocate(FALLOC_FL_ZERO_RANGE) to zero the log during format.

If I format a device, mounting and repair both fail because the primary
superblock UUID doesn't match the log UUID:
.....
And the format works this time too:

[root@abacus654 ~]# strace -s99 -o /tmp/a mkfs.xfs /dev/nvme0n1  -f
meta-data=/dev/nvme0n1           isize=512    agcount=6, agsize=268435455 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=1542990848, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
(reverse-i-search)`-n': od -tx1 -Ad -c /tmp/badlog3 | head ^C15
[root@abacus654 ~]# xfs_repair -n /dev/nvme0n1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...

In conclusion, the drive firmware is broken.

Question: Should we be doing /some/ kind of re-read after zeroing the
log to detect these sh*tty firmwares and fall back to a pwrite()?

No, userspace should not have to work around broken hardware. The
kernel needs to blacklist/quirk this device so that it will do
either:

a) redirect to a zeroing mechanism that actually works on that
device; or

b) fail the fallocate() call with -EOPNOTSUPP so that the
application can fall back to manual zeroing.
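
For illustration, the application-side fallback in option (b) could look
roughly like the sketch below. This is not the mkfs.xfs code; the helper
names zero_range() and zero_range_manual() and the 64 KiB buffer size are
made up for the example. It assumes Linux, where fallocate() and
FALLOC_FL_ZERO_RANGE are available.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: zero [offset, offset + len) by writing zero buffers. */
int zero_range_manual(int fd, off_t offset, off_t len)
{
    char buf[65536];    /* arbitrary chunk size chosen for this sketch */

    memset(buf, 0, sizeof(buf));
    while (len > 0) {
        size_t chunk = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
        ssize_t n = pwrite(fd, buf, chunk, offset);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same chunk */
            return -1;
        }
        if (n == 0) {
            errno = EIO;        /* no progress, give up instead of spinning */
            return -1;
        }
        offset += n;            /* short writes just advance and loop */
        len -= n;
    }
    return 0;
}

/*
 * Hypothetical helper: try the fast path first and fall back to manual
 * zeroing only when the kernel reports that the operation is unsupported.
 * Any other error is returned to the caller with errno intact.
 */
int zero_range(int fd, off_t offset, off_t len)
{
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return -1;
    return zero_range_manual(fd, offset, len);
}
```

Note that this only helps when the kernel actually returns an error. If
fallocate() reports success and the device has not zeroed the range,
nothing in this code path can tell, which is why the quirk has to live
in the kernel.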

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com