Re: MD software RAID1 vs suspend-to-disk
From: Daniel Pittman <hidden>
Date: 2009-03-02 03:40:30
"NeilBrown" [off-list ref] writes:
On Mon, March 2, 2009 1:23 pm, Daniel Pittman wrote:
John Robinson [off-list ref] writes:
On 01/03/2009 08:52, Daniel Pittman wrote:
I have a random desktop machine here, running Debian/sid with a 2.6.26 Debian kernel. It has a two disk software RAID1, and apparently passes through a suspend/resume cycle correctly, but...
[...]
No, that appears to be about suspending and resuming access to the MD device while reconfiguring it; I don't /think/ that is accessed during a system-wide suspend/resume (aka hibernate, or s2disk) cycle. Certainly, it doesn't look like the path is invoked for that from my reading of the code.

Correct, they are completely unrelated.

I have never tried hibernating to an md array, but I think others have, though I don't have a lot of specifics.

One observation is that you really don't want resync to start before the resume has completed. For this reason we have the 'start_ro' parameter. Setting that to 1, e.g.

    echo 1 > /sys/module/md_mod/parameters/start_ro

will mean that resync will not start until the first write to the array. The initrd should set this before assembling an md array to load a resume image from.
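A minimal sketch of what such an initrd step might look like, assuming md_mod is already loaded (or built in) at that point. The function name set_start_ro and its optional sysfs-root argument are illustrative only and are not taken from any distribution's initramfs scripts:

```shell
#!/bin/sh
# Hypothetical helper: set md's start_ro parameter before arrays are assembled.
# $1 is an optional sysfs root (defaults to /sys), so the logic can be
# exercised against a scratch directory.
set_start_ro() {
    param="${1:-/sys}/module/md_mod/parameters/start_ro"
    # The parameter file only exists once the md module is present.
    if [ -w "$param" ]; then
        echo 1 > "$param"
    else
        echo "start_ro not available at $param" >&2
        return 1
    fi
}
```

In an initrd this would be called before assembly, for example `set_start_ro && mdadm --assemble --scan`, so that the array stays read-only until the first write.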
Ah. Debian already do this; see: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=415441 (Actually, since you wrote in that bug thread you already know. :)

Hmmm. I have swap on LVM on MD, though, and I suspect that LVM writes to disk when it discovers and activates the volume groups...

Let me try and find out. Then I can go and be grumpy, but at least complain to the right people about this. :)

[...]
It should be that your observed symptom of "check reports 48800 mismatches" has nothing to do with hibernate/resume.
OK.
Presumably you have swap on md/raid1 (as that is where hibernate writes). The nature of swap writeout is that it is entirely possible for different data to be written to each device of a raid1 when a page is swapped out. However, in that case the data will never be read back in, so the apparent corruption is not a problem.
Well, that is a relief, at least.
I would recommend that you run 'repair' before hibernating, to be sure that the array is in sync. Then hibernate/resume and see if it is still in sync. I suspect it will be.
That seems reasonable; I will test it.
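For reference, a rough sketch of that test using md's sysfs files (sync_action and mismatch_cnt). The array path /sys/block/md0/md is an assumption, and the helper names start_action, wait_idle and mismatch_cnt are made up for illustration:

```shell
#!/bin/sh
# Each helper takes the array's md sysfs directory,
# e.g. /sys/block/md0/md (md0 is an assumed device name).

# Ask md to start an action such as "check" or "repair".
start_action() {
    echo "$2" > "$1/sync_action"
}

# Poll until the array reports "idle"; $2 is the poll interval in seconds.
wait_idle() {
    while [ "$(cat "$1/sync_action")" != "idle" ]; do
        sleep "${2:-5}"
    done
}

# Print the mismatch count from the most recent check/repair pass.
mismatch_cnt() {
    cat "$1/mismatch_cnt"
}

# Intended sequence (run as root, not executed here):
#   md=/sys/block/md0/md
#   start_action "$md" repair; wait_idle "$md"
#   start_action "$md" check;  wait_idle "$md"; mismatch_cnt "$md"
#   ... hibernate and resume ...
#   start_action "$md" check;  wait_idle "$md"; mismatch_cnt "$md"
```

As far as I understand it, the count shown right after a 'repair' reflects what that pass found, so a following 'check' is what gives the clean baseline to compare against after the resume.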
Regards,
Daniel