Re: [GIT PATCH 0/2] external-metadata recovery checkpointing for 2.6.33

From: Dan Williams <hidden>
Date: 2009-12-15 04:19:38

On Mon, Dec 14, 2009 at 5:37 PM, Dan Williams [off-list ref] wrote:

On Sun, 2009-12-13 at 21:07 -0700, Neil Brown wrote:

quoted

+static ssize_t recovery_start_store(mdk_rdev_t *rdev, const char *buf, size_t len)
+{
+     unsigned long long recovery_start;
+
+     if (cmd_match(buf, "none"))
+             recovery_start = MaxSector;
+     else if (strict_strtoull(buf, 10, &recovery_start))
+             return -EINVAL;
+
+     if (rdev->mddev->pers &&
+         rdev->raid_disk >= 0)
+             return -EBUSY;

Ok, I had a chance to test this out and have a question about how you
envisioned mdmon handling this restriction which is a bit tighter than
what I had before.  The prior version allowed updates as long as the
array was read-only.  This version forces recovery_start to be written
at sysfs_add_disk() time (before 'slot' is written). The conceptual
problem I ran into was a race between ->activate_spare() determining the
last valid checkpoint and the monitor thread starting up the array:

->activate_spare(): read recovery checkpoint
( array becomes read/write )
( array becomes dirty, checkpoint invalidated )
sysfs_add_disk(): write invalid recovery checkpoint
( recovery starts from the wrong location )

On second thought, if we get to activate_spare() it's already too
late.  Moving this to mdadm at assembly time (prior to setting
readonly) is a better approach.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help