Re: Adventures in btrfs raid5 disk recovery

From: Chris Murphy <hidden>
Date: 2016-06-25 16:44:54

On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn wrote:

> Well, the obvious major advantage that comes to mind for me to checksumming
> parity is that it would let us scrub the parity data itself and verify it.

OK but hold on. During scrub, it should read data, compute checksums
*and* parity, and compare those to what's on-disk -> EXTENT_CSUM in
the checksum tree, and the parity strip in the chunk tree. And if
parity is wrong, then it should be replaced.

Even 'echo check > md/sync_action' does this. So no pun intended but Btrfs
isn't even at parity with mdadm on data integrity if it doesn't check
if the parity matches data.
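
For reference, the md check mentioned above can be driven from sysfs. This is only a rough sketch: the function names and the `sysfs_root` parameter are made up for illustration, and the paths assume the usual /sys/block/mdX/md/ layout. Writing to sync_action needs root.

```python
from pathlib import Path


def start_check(md_dev: str, sysfs_root: str = "/sys") -> None:
    """Ask md to start a read-only consistency check of the array (e.g. "md0")."""
    Path(sysfs_root, "block", md_dev, "md", "sync_action").write_text("check\n")


def read_mismatch_count(md_dev: str, sysfs_root: str = "/sys") -> int:
    """Return md's mismatch_cnt; non-zero means redundancy did not match data."""
    path = Path(sysfs_root, "block", md_dev, "md", "mismatch_cnt")
    return int(path.read_text().strip())
```

On a real array you would call start_check("md0"), wait until sync_action reads "idle" again, and then look at read_mismatch_count("md0").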

> I'd personally much rather know my parity is bad before I need to use it
> than after using it to reconstruct data and getting an error there, and I'd
> be willing to bet that most seasoned sysadmins working for companies using
> big storage arrays likely feel the same about it.

That doesn't require parity csums though. It just requires computing
parity during a scrub and comparing it to the parity on disk to make
sure they're the same. If they aren't, assuming no other error for
that full stripe read, then the parity block is replaced.
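
To make that concrete, here is a minimal sketch of the logic for one full stripe. It is not the Btrfs scrub code. It assumes single (XOR) parity, uses zlib.crc32 as a stand-in for the real data checksum, and the names xor_parity and scrub_stripe are hypothetical.

```python
import zlib
from typing import Optional, Sequence, Tuple


def xor_parity(strips: Sequence[bytes]) -> bytes:
    """XOR equal-length data strips together to get the RAID5 parity strip."""
    length = len(strips[0])
    if any(len(s) != length for s in strips):
        raise ValueError("all strips must have the same length")
    parity = bytearray(length)
    for strip in strips:
        for i, byte in enumerate(strip):
            parity[i] ^= byte
    return bytes(parity)


def scrub_stripe(
    data_strips: Sequence[bytes],
    data_csums: Sequence[int],
    parity_on_disk: bytes,
) -> Tuple[str, Optional[bytes]]:
    """Verify one full stripe; return (status, parity to write or None)."""
    # First verify the data against its stored checksums. If any strip is
    # bad, the parity cannot be judged, so leave it alone.
    for strip, csum in zip(data_strips, data_csums):
        if zlib.crc32(strip) != csum:
            return "data_error", None
    # Data is good: recompute parity and compare with what is on disk.
    expected = xor_parity(data_strips)
    if expected == parity_on_disk:
        return "ok", None
    # Data verified but parity differs, so the parity strip is the bad one.
    return "parity_rewritten", expected
```

The point is that no parity checksum is needed: once the data strips pass their own checksums, recomputed parity is the reference.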

So that's also something to check in the code or poke a system with a
stick and see what happens.

> I could see it being
> practical to have an option to turn this off for performance reasons or
> similar, but again, I have a feeling that most people would rather be able
> to check if a rebuild will eat data before trying to rebuild (depending on
> the situation in such a case, it will sometimes just make more sense to nuke
> the array and restore from a backup instead of spending time waiting for it
> to rebuild).

The much bigger problem we have right now, which affects Btrfs and
LVM/mdadm md raid alike, is this silly bad default: non-enterprise
drives have no configurable SCT ERC and therefore long recovery
times, while the kernel SCSI command timer is 30 seconds. That
actually fucks over regular single disk users too, because it
means they don't get the "benefit" of long recovery times, which is
the whole g'd point of that feature. This itself causes so many
problems where bad sectors just get worse and don't get fixed up
because of all the link resets. So I still think it's a bullshit
default kernel side because it pretty much affects the majority use
case; it is only a non-problem with proprietary hardware raid, and
software raid using enterprise (or NAS specific) drives that already
have short recovery times by default.
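
The mismatch can be stated as a simple check: the kernel's command timer has to outlast the drive's worst-case recovery time, otherwise the link gets reset before the drive ever reports the bad sector. A rough sketch follows. The function names, the 5 second margin and the 180 second worst case for drives without usable SCT ERC are assumptions for illustration, not measured values. The ERC value is in tenths of a second, as smartctl -l scterc reports it.

```python
from pathlib import Path
from typing import Optional

# Assumption for illustration only: a drive without usable SCT ERC may keep
# retrying a bad sector for up to this many seconds.
ASSUMED_WORST_CASE_RECOVERY_S = 180.0


def read_command_timer(dev: str, sysfs_root: str = "/sys") -> int:
    """Return the kernel SCSI command timer for dev (e.g. "sda"), in seconds."""
    path = Path(sysfs_root, "block", dev, "device", "timeout")
    return int(path.read_text().strip())


def timer_outlasts_drive(
    erc_deciseconds: Optional[int],
    command_timer_s: int,
    margin_s: float = 5.0,
) -> bool:
    """True if the kernel waits longer than the drive's worst-case recovery."""
    if erc_deciseconds:
        # A non-zero SCT ERC limit is set, e.g. 70 means 7.0 seconds.
        worst_case_s = erc_deciseconds / 10.0
    else:
        # ERC disabled (0) or unsupported (None): assume a long recovery.
        worst_case_s = ASSUMED_WORST_CASE_RECOVERY_S
    return command_timer_s >= worst_case_s + margin_s
```

With the 30 second default, timer_outlasts_drive(70, 30) holds for a drive set to 7 second ERC, while timer_outlasts_drive(None, 30) does not, which is the out-of-the-box situation described above.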

This has been true for a very long time, maybe a decade. And it's such
complete utter crap that this hasn't been dealt with properly by any
party. No distribution has fixed this for their users. Upstream udev
hasn't dealt with it. And kernel folks haven't dealt with it. It's a
perverse joke on the user to do this out of the box.



-- 
Chris Murphy
