Thread (23 messages) 23 messages, 2 authors, 2016-10-26

Re: [PATCH v5 7/8] md/r5cache: r5c recovery

From: NeilBrown <hidden>
Date: 2016-10-19 02:32:48

On Thu, Oct 13 2016, Song Liu wrote:
This is the recovery part of raid5-cache.

With cache feature, there are 2 different scenarios of recovery:
1. Data-Parity stripe: a stripe with complete parity in journal.
2. Data-Only stripe: a stripe with only data in journal (or partial
   parity).

The code differentiate Data-Parity stripe from Data-Only stripe with
flag (STRIPE_R5C_WRITTEN).

For Data-Parity stripes, we use the same procedure as raid5 journal,
where all the data and parity are replayed to the RAID devices.

For Data-Only strips, we need to finish complete calculate parity and
finish the full reconstruct write or RMW write. For simplicity, in
the recovery, we load the stripe to stripe cache. Once the array is
started, the stripe cache state machine will handle these stripes
through normal write path.

r5c_recovery_flush_log contains the main procedure of recovery. The
recovery code first scans through the journal and loads data to
stripe cache. The code keeps tracks of all these stripes in a list
(use sh->lru and ctx->cached_list), stripes in the list are
organized in the order of its first appearance on the journal.
During the scan, the recovery code assesses each stripe as
Data-Parity or Data-Only.

During scan, the array may run out of stripe cache. In these cases,
the recovery code will also call raid5_set_cache_size to increase
stripe cache size.
What if this fails.  Maybe the array was created on a machine with lots
or memory, but that machine died and you are trying to recovery you data
on a much less capable machine.

I don't think there is an easy answer, but at least you need to fail
gracefully.

I'll try to look at the rest tomorrow.

NeilBrown

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help