Thread: 19 messages, 8 authors, 2017-05-09

Re: RAID creation resync behaviors

From: NeilBrown <hidden>
Date: 2017-05-09 20:30:57

On Tue, May 09 2017, Jes Sorensen wrote:
On 05/03/2017 10:04 PM, Shaohua Li wrote:
On Thu, May 04, 2017 at 11:07:01AM +1000, Neil Brown wrote:
On Wed, May 03 2017, Shaohua Li wrote:
Hi,

Currently we have different resync behaviors in array creation.

- raid1: copy data from disk 0 to disk 1 (overwrite)
- raid10: read both disks, compare and write if there is difference (compare-write)
- raid4/5: read first n-1 disks, calculate parity and then write parity to the last disk (overwrite)
- raid6: read all disks, calculate parity and compare, and write if there is difference (compare-write)
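The two strategies in the list above can be sketched in a few lines. This is only an illustration on in-memory "disks" (lists of equal-sized byte blocks), not the md implementation; the function names `xor_blocks`, `resync_overwrite` and `resync_compare_write` are hypothetical. With a single source disk the XOR reduces to a plain copy, which models the mirror case; with several source disks it models single-parity RAID4/5.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (single-parity calculation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def resync_overwrite(source_disks, target):
    """Overwrite: never read the target, always write the expected block."""
    writes = 0
    for i in range(len(target)):
        target[i] = xor_blocks([disk[i] for disk in source_disks])
        writes += 1
    return writes


def resync_compare_write(source_disks, target):
    """Compare-write: read the target as well, write only on a mismatch."""
    writes = 0
    for i in range(len(target)):
        expected = xor_blocks([disk[i] for disk in source_disks])
        if target[i] != expected:
            target[i] = expected
            writes += 1
    return writes


# Three stripes; only the last one holds non-zero data.
zero = bytes(4)
d0 = [zero, zero, b"\x01\x02\x03\x04"]
d1 = [zero, zero, b"\x10\x20\x30\x40"]

parity_a = [zero, zero, zero]
parity_b = [zero, zero, zero]
overwrite_writes = resync_overwrite([d0, d1], parity_a)
compare_writes = resync_compare_write([d0, d1], parity_b)
```

Both calls leave the target in the same state. Overwrite issues one write per stripe regardless of content, while compare-write pays an extra read per stripe and writes only where the target differs.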
The approach taken for raid1 and raid4/5 provides the fastest sync for
an array built on uninitialised spinning devices.
RAID6 could use the same approach but would involve more CPU and so
the original author of the RAID6 code (hpa) chose to go for the low-CPU
cost option.  I don't know if tests were done, or if they would still be
valid on new hardware.
The raid10 approach comes from "it is too hard to optimize in general
because different RAID10 layouts have different trade-offs, so just
take the easy way out."
ok, thanks for the explanation!
Writing the whole disk is very unfriendly to SSDs, because it reduces their
lifetime. And if the user has already done a trim before creation, the
unnecessary writes could make the SSD slower in the future. Could we prefer
compare-write to overwrite if mdadm detects that the disks are SSDs? Surely
compare-write is sometimes slower than overwrite, so maybe add a new option to
mdadm. An option to let mdadm trim the SSD before creation sounds reasonable too.
An option to ask mdadm to trim the data space and then --assume-clean
certainly sounds reasonable.
This doesn't work well. Reads return zeros for trimmed data space on some SSDs,
but not all. When they don't, we will have trouble.
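To make the concern concrete, here is a minimal sketch of checking whether a region actually reads back as zeros. The helper name `reads_as_zeros` is hypothetical and it works on any binary file-like object. It only reports what the device returned for this one read; it says nothing about what the device guarantees for later reads of trimmed space.

```python
import io


def reads_as_zeros(dev, length, chunk_size=1 << 20):
    """Return True if the next `length` bytes read from `dev` are all zero."""
    remaining = length
    while remaining > 0:
        chunk = dev.read(min(chunk_size, remaining))
        if not chunk:
            # Short read: the region could not be fully checked.
            return False
        if chunk.count(0) != len(chunk):
            return False
        remaining -= len(chunk)
    return True


# In-memory stand-ins for a zero-returning and a non-zero-returning device.
all_zero = reads_as_zeros(io.BytesIO(bytes(4096)), 4096)
has_stale_data = reads_as_zeros(io.BytesIO(bytes(100) + b"\x01" + bytes(100)), 201)
```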
/sys/block/<device>/queue/discard_zeroes_data

We could use this as an indicator for what to do.
According to

Documentation/ABI/testing/sysfs-block

Description:
                Will always return 0.  Don't rely on any specific behavior
                for discards, and don't read this file.

See also
 Commit: 48920ff2a5a9 ("block: remove the discard_zeroes_data flag")

NeilBrown
