Re: mdadm stuck at 0% reshape after grow

From: Phil Turmel <hidden>
Date: 2017-12-06 16:21:18

On 12/06/2017 11:03 AM, Andreas Klauer wrote:

On Wed, Dec 06, 2017 at 09:15:21AM -0500, Phil Turmel wrote:

The problem with this is that the sectors currently marked don't have
appropriate data.

It might have the correct data. Depends what exactly happened.
If it happened years ago and you never noticed until reshape, 
chances are it won't matter one way or another.

No, almost certainly not the correct data.  The data that was attempted
to be written at the time the BB was added didn't make it to disk, and
any future updated data writes would be skipped since it's in the list.

Of course, it doesn't hurt to take additional steps, if you have 
backups to compare with or some other way to check file integrity.

If you check integrity before deleting the BBL, MD reconstructs the
data.  If you check integrity after deleting the BBL, MD is giving you
the garbage (because it doesn't know to reconstruct).

If you have a filesystem with bad blocks management on top of it, 
check that too and clear it if necessary.

MD's BBL system doesn't coordinate with the filesystem on top, so this
is meaningless.

MD with duped BBLs does return read errors, so it's a possibility.

No, it doesn't.  The read error is only passed to the filesystem if
there's no redundancy left for the block address.
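
A toy model of the rule stated above may make it easier to follow. It is only a sketch, not MD's actual code: the function name `md_read_outcome` and its two parameters are hypothetical, and "redundancy" is simplified to the number of member failures the RAID level tolerates (1 for RAID5, 2 for RAID6).

```python
def md_read_outcome(members_listing_block, redundancy):
    """Toy model of a read against one block address.

    members_listing_block: how many member devices have this address
        in their bad block list (hypothetical input).
    redundancy: how many unavailable members the RAID level tolerates
        at one address (assumed: 1 for RAID5, 2 for RAID6).
    """
    if members_listing_block == 0:
        # Nothing is marked bad, so the data is read as usual.
        return "read directly"
    if members_listing_block <= redundancy:
        # The marked members are skipped and the data is rebuilt from
        # the remaining members. The upper layer never sees an error.
        return "reconstructed from redundancy"
    # Too many members list the address: nothing is left to rebuild
    # from, and only now does the error reach the filesystem.
    return "read error passed to filesystem"


for bad in range(3):
    print(bad, "member(s) marked on RAID5:", md_read_outcome(bad, 1))
```

In this model a single BBL entry on a RAID5 is invisible to the filesystem, while the same address listed on two members is a hard read error.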

The BBL in MD is woefully incomplete and should *never* be used.

There's ups and downs to everything. Relocations would be awful too. 
Harms performance and makes recovery all but impossible. So many people 
on this list with lost metadata, figuring out RAID layout and drive 
order is hard, but figuring out random relocations is impossible.

There's no "up" to the existing BBL.  It isn't doing what people think.
It does NOT cause the upper layer to avoid the block address.  It just
kills redundancy at that address.
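
To see where redundancy is already gone, one could compare the lists from all members. The sketch below assumes the text comes from `mdadm --examine-badblocks` and that each list looks like a `Bad-blocks on <device>:` header followed by `<sector> for <count> sectors` lines; that layout is an assumption to check against your mdadm version. It also assumes the sector numbers are directly comparable between members (same data offset on every device). The function names and the sample data are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line formats of `mdadm --examine-badblocks` output.
HEADER = re.compile(r"^Bad-blocks on (\S+):\s*$")
ENTRY = re.compile(r"^\s*(\d+) for (\d+) sectors\s*$")


def parse_bbl(text):
    """Return {device: [(start, end), ...]} with half-open sector ranges."""
    lists = defaultdict(list)
    dev = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            dev = m.group(1)
            lists[dev]  # register the device even if its list is empty
            continue
        m = ENTRY.match(line)
        if m and dev is not None:
            start, count = int(m.group(1)), int(m.group(2))
            lists[dev].append((start, start + count))
    return dict(lists)


def lost_ranges(lists, redundancy):
    """Ranges listed by more members than the array can tolerate."""
    # Cut the sector axis at every range boundary, then count, for each
    # resulting segment, how many devices list it as bad.
    points = sorted({p for ranges in lists.values() for r in ranges for p in r})
    out = []
    for a, b in zip(points, points[1:]):
        devs = sorted(d for d, ranges in lists.items()
                      if any(s <= a and b <= e for s, e in ranges))
        if len(devs) > redundancy:
            out.append((a, b, devs))
    return out


# Made-up sample data, not taken from a real array.
SAMPLE = """\
Bad-blocks on /dev/sda1:
                2048 for 8 sectors
                4096 for 16 sectors
Bad-blocks on /dev/sdb1:
                2052 for 8 sectors
"""

for start, end, devs in lost_ranges(parse_bbl(SAMPLE), redundancy=1):
    print(f"sectors {start}..{end - 1} listed on {', '.join(devs)}")
```

With `redundancy=1` (RAID5), any range reported here can no longer be reconstructed, so reads of it fail outright.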

The BBL could be improved a lot if it prevented BBLs from being identical 
across drives, and gave bad blocks a second chance. Once the cable 
problem is solved, MD should help you turn those bad blocks back 
into good ones.

MD does exactly this with all modern hard drives using the drives'
built-in relocation systems.  And the write-intent bitmap/re-add feature
helps efficiently deal with writes that were missed on that device while
it was disconnected.

The only thing a BBL could actually help with on modern drives is an
exhausted on-drive relocation table, and only if the BBL was able to do
relocations itself.  Of course, by the time a drive exhausts its internal
spares, it's too far gone to trust anyways.

And if your drive actually has real bad blocks, the only correct course 
of action is to replace it entirely.

No, modern drives will attempt to fix blocks on rewrite, and will
relocate them internally if unfixable.  Precisely what you think MD's
BBL should do.  MD's BBL is creating an unfixable mess, not actually
fixing anything.

This is why I suggested using hdparm to pass the BBL data to the
underlying drive.  Then MD *will* actually fix each block.

The problem with BBL right now is 
that even if you replace all drives, the BBL stays. Once it's duplicated 
you are stuck with it forever until you forcibly remove it.

The problem with the BBL right now is its existence.

Phil