Re: Huge values of mismatch_cnt on RAID 6 arrays under Fedora 18
From: Piergiorgio Sartor <hidden>
Date: 2013-01-28 19:18:25
On Mon, Jan 28, 2013 at 08:00:35PM +0100, Wolfgang Denk wrote:
> Dear Piergiorgio,
>
> In message [ref] you wrote:
>> I would shamelessly suggest trying "raid6check", in order to see if
>> some components have problems. The software is somewhat buried in the
>> "mdadm" source code; you will probably need to take it from the
>> repository.
>
> Found it. Thanks for the suggestion. However, this is extremely
> verbose:
>
> layout: 2
> disks: 8
> component size: 249108103168
> total stripes: 15204352
> chunk size: 16384
>
> disk: 0 - offset: 134217728 - size: 250864926720 - name: /dev/sdk1 - slot: 5
> disk: 1 - offset: 134217728 - size: 250864926720 - name: /dev/sdj1 - slot: 4
> disk: 2 - offset: 134217728 - size: 250864926720 - name: /dev/sdi1 - slot: 7
> disk: 3 - offset: 134217728 - size: 250864926720 - name: /dev/sdh1 - slot: 3
> disk: 4 - offset: 134217728 - size: 250864926720 - name: /dev/sdg1 - slot: 2
> disk: 5 - offset: 134217728 - size: 250864926720 - name: /dev/sdf1 - slot: 1
> disk: 6 - offset: 134217728 - size: 250864926720 - name: /dev/sde1 - slot: 6
> disk: 7 - offset: 134217728 - size: 250863844352 - name: /dev/sdd1 - slot: 0
>
> pos --> 0 0->1 1->2 2->3 3->4 4->5 5->6
> pos --> 1 0->0 1->1 2->2 3->3 4->4 5->5
> pos --> 2 0->7 1->0 2->1 3->2 4->3 5->4
> pos --> 3 0->6 1->7 2->0 3->1 4->2 5->3
> pos --> 4 0->5 1->6 2->7 3->0 4->1 5->2
> pos --> 5 ...
>
> etc. ad nauseam.
>
> I guess "pos" means stripe here, so it would print this for all
> stripes in the array? Does this mean all of them are broken? Or what
> would I have to look for to see where an error might be?

Hi Wolfgang,

the output is indeed verbose. My suggestion would be to redirect it to
a file (on different storage) and "grep" it later for "Error". This
should report whether a specific device is detected as having problems,
or whether the tool cannot tell which device is at fault.

The output you see above means everything is correct, up to stripe 4
at least. So you're right, "pos" is the stripe position.

In case of error, you would see something like:

Error detected at X: possible failed disk slot: Y

which means stripe X, disk Y, from the initial print. Or it could be:

Error detected at X: disk slot unknown

which should be self-explanatory.

Hope this helps,

bye,

-- 
piergiorgio
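
For a log too large to read by eye, the "grep for Error" step could also be done with a short script. What follows is only a sketch, not part of raid6check or mdadm: the function names `summarize` and `report` are made up for illustration, the patterns are taken from the two message formats and the "disk:" header lines quoted in this thread, and it assumes that the number after "disk slot:" corresponds to the "slot:" field of the initial print.

```python
import re
from collections import Counter

# Header lines as quoted in this thread, e.g.
# "disk: 0 - offset: 134217728 - size: 250864926720 - name: /dev/sdk1 - slot: 5"
DISK_RE = re.compile(
    r"disk: (\d+) - offset: \d+ - size: \d+ - name: (\S+) - slot: (\d+)"
)

# The two error formats described above:
# "Error detected at X: possible failed disk slot: Y"
# "Error detected at X: disk slot unknown"
ERROR_RE = re.compile(
    r"Error detected at (\d+): "
    r"(?:possible failed disk slot: (\d+)|disk slot unknown)"
)


def summarize(log_text):
    """Return ({slot: device name}, [(stripe, slot or None), ...])."""
    # Assumption: the slot in an error message matches the "slot:" field.
    slot_names = {
        int(m.group(3)): m.group(2) for m in DISK_RE.finditer(log_text)
    }
    errors = []
    for m in ERROR_RE.finditer(log_text):
        slot = int(m.group(2)) if m.group(2) is not None else None
        errors.append((int(m.group(1)), slot))
    return slot_names, errors


def report(log_text):
    """Build a short text summary: total errors, then a count per slot."""
    slot_names, errors = summarize(log_text)
    counts = Counter(slot for _, slot in errors)
    lines = [f"{len(errors)} stripe(s) with errors"]
    for slot, n in counts.most_common():
        if slot is None:
            lines.append(f"  slot unknown: {n}")
        else:
            name = slot_names.get(slot, "?")
            lines.append(f"  slot {slot} ({name}): {n}")
    return "\n".join(lines)
```

With the output saved to a file (the name "raid6check.log" is just an example), `print(report(open("raid6check.log").read()))` would print the summary. If most errors point at a single slot, that component is the first one to suspect.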