Thread: 20 messages, 4 authors, 2007-05-29

Re: raid5: I lost a XFS file system due to a minor IDE cable problem

From: David Chinner <hidden>
Date: 2007-05-29 03:28:03
Also in: linux-xfs

On Mon, May 28, 2007 at 05:45:27PM -0500, Alberto Alonso wrote:
On Fri, 2007-05-25 at 18:36 +1000, David Chinner wrote:
On Fri, May 25, 2007 at 12:43:51AM -0500, Alberto Alonso wrote:
I think his point was that going into a read-only mode causes a
less catastrophic situation (i.e. a web server can still serve
pages).
Sure - but once you've detected one corruption or had metadata
I/O errors, can you trust the rest of the filesystem?
I think that is a valid point: rather than shutting down
the file system completely, an automatic switch to the mode with the
least disruption of service is always desired.
I consider the possibility of serving out bad data (i.e. after
a remount to readonly) to be the worst possible disruption of
service that can happen ;)
I guess it does depend on the nature of the failure. A write failure
on block 2000 does not imply corruption of the other 2TB of data.
The rest might not be corrupted, but if block 2000 is an index of
some sort (i.e. metadata), you could reference any of that 2TB
incorrectly and get the wrong data, write to the wrong spot on disk,
etc.
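
The point above can be illustrated with a toy model. This is only a
sketch under stated assumptions: the block numbers, file names, and the
read_file helper are hypothetical and do not reflect how XFS actually
lays out its metadata. It shows that when a single index entry is wrong,
a read returns the wrong data even though every data block is intact.

```python
# Toy model: a "disk" of data blocks plus one index (metadata) structure.
# All names and block numbers are hypothetical, not XFS's on-disk format.

data_blocks = {
    100: b"contents of a.txt",
    101: b"contents of b.txt",
    102: b"contents of c.txt",
}
original_data = dict(data_blocks)  # snapshot to show data is never modified

# The index maps file names to the data block holding their contents.
index = {"a.txt": 100, "b.txt": 101, "c.txt": 102}


def read_file(index, data_blocks, name):
    """Look up a file's block number in the index and return that block."""
    return data_blocks[index[name]]


# Simulate a damaged index: one entry now points at the wrong block.
damaged_index = dict(index)
damaged_index["a.txt"] = 102

intact_read = read_file(index, data_blocks, "a.txt")
damaged_read = read_file(damaged_index, data_blocks, "a.txt")

# No data block changed, yet the damaged index serves the wrong contents.
data_unchanged = data_blocks == original_data
```

Here intact_read holds the contents of a.txt, while damaged_read silently
holds the contents of c.txt. A write routed through the damaged index
would likewise land on the wrong block.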
I personally have found the XFS file system to be great for
my needs (except issues with NFS interaction, where the bug report
never got answered), but that doesn't mean it cannot be improved.
Got a pointer?
I can't seem to find it. I'm pretty sure I used bugzilla to report
it. I did find the kernel dump file though, so here it is:

Oct  3 15:34:07 localhost kernel: xfs_iget_core: ambiguous vns:
vp/0xd1e69c80, invp/0xc989e380
Oh, I haven't seen any of those problems for quite some time.
= /proc/kmsg started.
Oct  3 15:51:23 localhost kernel:
Inspecting /boot/System.map-2.6.8-2-686-smp
Oh, well, yes, kernels that old did have that problem. It got fixed
some time around 2.6.12 or 2.6.13 IIRC....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group