Re: raid5: I lost a XFS file system due to a minor IDE cable problem
From: Pallai Roland <hidden>
Date: 2007-05-25 14:01:27
Also in: linux-xfs
On Friday 25 May 2007 03:35:48 Pallai Roland wrote:
On Fri, 2007-05-25 at 10:05 +1000, David Chinner wrote:
On Thu, May 24, 2007 at 07:20:35AM -0400, Justin Piszcz wrote:
On Thu, 24 May 2007, Pallai Roland wrote:
It's a good question too, but I think the md layer could save dumb filesystems like XFS if it denied writes after 2 disks have failed, and I cannot see a good reason why it doesn't behave this way.

How is *any* filesystem supposed to know that the underlying block device has gone bad if it is not returning errors?

It is returning errors, I think. If I try to write to a raid5 array with 2 failed disks using dd, I get errors on the missing chunks. The difference between ext3 and XFS is that ext3 remounts read-only on the first write error but XFS doesn't: XFS fails only the current operation, IMHO. The ext3 method isn't perfect, but in practice it works well.
Sorry, I was wrong: md really isn't returning an error! It's madness, IMHO. The reason ext3 is safer on raid5 in practice is that ext3 remounts read-only on read errors too: when a raid5 array has 2 failed drives and some read hits it, the errors= behavior of ext3 kicks in and stops further writes. You're right, it's not a good solution, since a read operation has to occur for it to prevent data loss in this case on ext3 too. Raid5 *must deny all writes* when 2 disks have failed: I still can't see a good reason why not, and the current method is braindead!
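To make the argument above concrete, here is a minimal Python sketch of the two behaviours being contrasted. It is a toy model only, not md driver code: the class `ToyRaid5`, the flag `deny_writes_when_failed`, and the chunk-to-disk mapping (`chunk_no % n_disks`, which ignores parity rotation) are all assumptions made for illustration.

```python
import errno


class ToyRaid5:
    """Toy model of a RAID5 array; NOT the md driver's real logic."""

    def __init__(self, n_disks, deny_writes_when_failed):
        self.n_disks = n_disks
        self.failed = set()          # indices of failed member disks
        self.deny = deny_writes_when_failed
        self.lost_writes = 0         # writes accepted but not stored anywhere

    def fail_disk(self, idx):
        self.failed.add(idx)

    def write(self, chunk_no):
        # RAID5 survives exactly one failed member.
        if len(self.failed) <= 1:
            return "ok"
        if self.deny:
            # Behaviour argued for in the message: refuse every write.
            raise OSError(errno.EIO, "array has failed, write denied")
        # Behaviour described in the message: no error reaches the caller.
        # Simplification (assumption): a chunk is lost if "its" disk failed.
        if chunk_no % self.n_disks in self.failed:
            self.lost_writes += 1
        return "ok"


def run(deny):
    """Write 8 chunks to a 4-disk array with 2 failed disks."""
    array = ToyRaid5(4, deny)
    array.fail_disk(2)
    array.fail_disk(3)
    errors = 0
    for chunk in range(8):
        try:
            array.write(chunk)
        except OSError:
            errors += 1
    return errors, array.lost_writes


print(run(False))  # (0, 4): no errors seen, 4 writes silently lost
print(run(True))   # (8, 0): every write fails loudly, nothing lost silently
```

With the flag off, the caller sees no error at all, so no filesystem error policy can react; with it on, the first write already fails and the filesystem gets a chance to stop.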
David Chinner wrote:
I did mention this exact scenario in the filesystems workshop back in February - we'd *really* like to know if a RAID block device has gone into degraded mode (i.e. lost a disk) so we can throttle new writes until the rebuild has been completed. Stopping writes completely on a fatal error (like 2 lost disks in RAID5, and 3 lost disks in RAID6) would also be possible if only we could get the information out of the block layer.
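The degraded/fatal distinction above (1 lost disk vs. 2 in RAID5, up to 2 vs. 3 in RAID6) can be sketched from userspace by reading the `[total/working]` counters that /proc/mdstat prints for each array. This is only an illustration of the thresholds, not the in-kernel notification being asked for: the function name `classify_arrays` and the sample text are assumptions, and the parser handles only the simple `mdN : active raidN ...` layout.

```python
import re

# Member failures each level survives (RAID5: 1, RAID6: 2).
TOLERATED_FAILURES = {"raid4": 1, "raid5": 1, "raid6": 2}


def classify_arrays(mdstat_text):
    """Return {array_name: "healthy" | "degraded" | "failed"}."""
    states = {}
    name = level = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:\s*\S+\s+(raid\d+)\b", line)
        if header:
            name, level = header.groups()
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if name and counts and level in TOLERATED_FAILURES:
            total, working = int(counts.group(1)), int(counts.group(2))
            missing = total - working
            if missing == 0:
                states[name] = "healthy"
            elif missing <= TOLERATED_FAILURES[level]:
                states[name] = "degraded"   # still redundant enough to run
            else:
                states[name] = "failed"     # data is no longer reconstructible
            name = None
    return states


# Illustrative sample, written by hand in /proc/mdstat style.
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      2930279808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
md1 : active raid5 sdh1[3](F) sdg1[2](F) sdf1[1] sde1[0]
      2930279808 blocks level 5, 64k chunk, algorithm 2 [4/2] [UU__]
md2 : active raid6 sdl1[3](F) sdk1[2](F) sdj1[1] sdi1[0]
      1953519872 blocks level 6, 64k chunk, algorithm 2 [4/2] [UU__]
unused devices: <none>
"""

print(classify_arrays(SAMPLE))
# {'md0': 'healthy', 'md1': 'failed', 'md2': 'degraded'}
```

The same two missing disks are fatal for the RAID5 array but only degrade the RAID6 one, which is the threshold a filesystem would need to be told about.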
Yes, it sounds good, but I think we need a quick fix now: it's a real problem and can easily lead to mass data loss.

--
d