Re: mismatch_cnt constantly goes up on ssd+hdd raid1
From: tlknv <hidden>
Date: 2015-06-25 15:33:16
Neil,

Thanks a lot for all the info and steps to identify the problem.

I have just discovered that I had the 'discard' mount option set even though I thought it wasn't there :-( After removing 'discard' and forcing a 'repair', mismatch_cnt stays at 0 even after a bunch of writes and (most importantly) deletes on the partition.

BTW, what are the units of mismatch_cnt? Is it 512-byte sectors or something else?

AFAIU md could potentially collect info on trimmed sectors/blocks and exclude them from mismatch checking. Couldn't it?

I'll look at the range of the sectors which are different even when mismatch_cnt is 0.

Thanks again,
Boris

25.06.2015, 10:25, "NeilBrown" [off-list ref]:
On Thu, 25 Jun 2015 10:19:59 +0500 Roman Mamedov [off-list ref] wrote:
On Thu, 25 Jun 2015 11:33:35 +1000 NeilBrown [off-list ref] wrote:
> On Sun, 14 Jun 2015 20:13:16 +0300 tlknv [off-list ref] wrote:
>
> > Hello,
> >
> > I have raid 1 which mirrors a root/boot partition on 1 SSD and 2 HDDs
> > (write-mostly). mismatch_cnt goes up even when there are very few
> > writes to the partition as /var is mounted separately. After I update
> > several packages I typically see mismatch_cnt somewhere between
> > 500,000 and 2,000,000. I have read a number of threads in this DL
> > but could not find an explanation of what could cause mismatch_cnt
> > to grow that much. I checked md5 sums using
> > /var/lib/dpkg/info/*.md5sums, and didn't see many errors, even
> > though there are few, mostly in text files which look ok to me. I
> > guess when I check, all reads go to SSD (as both HDDs in this raid
> > are write-mostly), and thus md5sum only shows no problem on
> > SSD. Note, this partition is used as both boot and root and just in
> > case here is some more info about my system:
>
> This does surprise me.
>
> I had another look at the code and there could be a bug that would let
> 'check' see the difference between when the first write completes and
> when the write-behind writes complete, but you would need to run the
> check while the install was happening for that to be noticed, and even
> then you would need to be unlucky.

Couldn't this be simply the normal observed effect of using TRIM on SSD?

Yes, of course it could. I try not to think about TRIM too much - makes me ill :-)

Thanks,
NeilBrown
After deleting some files, the filesystem issues a discard request. It does nothing to the HDDs, but the content of the discarded areas on the SSD is no longer deterministic (or is mostly zeroed, as mentioned in the original report). So there is now a mismatch between the content of the HDDs and the SSD, but since it is in the area of deleted files, it doesn't affect the system in any way.
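
The effect described above can be illustrated with a small toy model in Python. This is only a sketch, not how md is implemented: the mirror is two in-memory lists, the function names (make_mirror, write, discard, count_mismatches) are made up for illustration, and it assumes the SSD reads back zeroes after a discard while the HDD ignores the request.

```python
BLOCK_COUNT = 16  # hypothetical size of the toy mirror, in blocks


def make_mirror(block_count=BLOCK_COUNT):
    # Both legs start out identical (every block empty).
    ssd = [b""] * block_count
    hdd = [b""] * block_count
    return ssd, hdd


def write(ssd, hdd, block, data):
    # A RAID1 write goes to every leg, so the legs stay in sync.
    ssd[block] = data
    hdd[block] = data


def discard(ssd, hdd, block, trimmed=b"\0"):
    # Assumption: the SSD returns different content (zeroes here) after a
    # discard, while the HDD does nothing and keeps the old data.
    ssd[block] = trimmed


def count_mismatches(ssd, hdd):
    # Roughly what a 'check' pass notices: blocks that differ between legs.
    return sum(1 for a, b in zip(ssd, hdd) if a != b)


def live_data_matches(ssd, hdd, live_blocks):
    # Blocks still owned by files must be identical on both legs.
    return all(ssd[b] == hdd[b] for b in live_blocks)


ssd, hdd = make_mirror()
for block in range(8):
    write(ssd, hdd, block, f"file-data-{block}".encode())
mismatches_before = count_mismatches(ssd, hdd)

# "Delete" a file that occupied blocks 2..4; the filesystem discards them.
for block in (2, 3, 4):
    discard(ssd, hdd, block)
mismatches_after = count_mismatches(ssd, hdd)
live_ok = live_data_matches(ssd, hdd, [0, 1, 5, 6, 7])

print(mismatches_before, mismatches_after, live_ok)
```

In this model the mismatch count rises only for the discarded blocks, while every block still belonging to a file stays identical on both legs, which matches the observation that the mismatches are harmless.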