FW: change in disk failure policy for non-BBL arrays?
From: Chris Walker <hidden>
Date: 2017-11-03 19:58:28
Hello,
I was looking at this again today and it appears that with this change, error handling no longer works correctly in RAID10 (I haven't checked the other levels yet). Without a BBL configured, an error cycles through fix_read_error until max_read_errors is exceeded, and only then is the drive kicked out of the array. For example, if I inject errors in response to both read and write commands at sector 16392 of /dev/sda, logs in response to a read of the corresponding md0 sector look like:
(many repeats)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: unable to read back corrected sectors (8 sectors at 16392 on sda)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: failing drive
Oct 27 16:15:16 c1 kernel: md/raid10:md0: read correction write failed (8 sectors at 16392 on sda)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: failing drive
Oct 27 16:15:16 c1 kernel: md/raid10:md0: unable to read back corrected sectors (8 sectors at 16392 on sda)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: failing drive
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: Raid device exceeded read_error threshold [cur 21:max 20]
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: Failing raid device
Oct 27 16:15:16 c1 kernel: md/raid10:md0: Disk failure on sda, disabling device.
Previously, the drive would have been failed out of the array by the call to md_error at the end of r10_sync_page_io.
Is there an appetite for a patch that takes the easy way out and reverts to the previous behavior, with changes like the following?
-       if (!rdev_set_badblocks(rdev, sector, sectors, 0))
+       if (!rdev_set_badblocks(rdev, sector, sectors, 0) || rdev->badblocks.shift < 0)
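For illustration, here is a rough userspace model of the existing check next to the proposed one. It is only a sketch, not kernel code: struct mock_rdev and every mock_* or handle_error_* name are hypothetical stand-ins, and it assumes (as described in this thread) that rdev_set_badblocks reports success when the bad block list is disabled and that a negative shift marks the list as disabled.

```c
/* Hypothetical stand-in for the few md_rdev fields this sketch needs. */
struct mock_rdev {
    int bb_shift;      /* < 0 means the bad block list is disabled */
    int bb_table_full; /* nonzero: no room left to record a range */
    int faulty;        /* set once the drive has been failed */
};

/* Models the behaviour described in this thread: nonzero means the range
 * was "recorded", and a disabled list is reported as success. */
static int mock_rdev_set_badblocks(const struct mock_rdev *rdev)
{
    if (rdev->bb_shift < 0)
        return 1;
    return rdev->bb_table_full ? 0 : 1;
}

/* Stand-in for md_error(): mark the drive as failed. */
static void mock_md_error(struct mock_rdev *rdev)
{
    rdev->faulty = 1;
}

/* Existing check: a disabled list never reaches mock_md_error(). */
static void handle_error_current(struct mock_rdev *rdev)
{
    if (!mock_rdev_set_badblocks(rdev))
        mock_md_error(rdev);
}

/* Proposed check: a disabled list is treated like a failed recording. */
static void handle_error_proposed(struct mock_rdev *rdev)
{
    if (!mock_rdev_set_badblocks(rdev) || rdev->bb_shift < 0)
        mock_md_error(rdev);
}
```

With bb_shift set to -1, handle_error_current leaves faulty at 0 while handle_error_proposed sets it to 1. With the list enabled and room in the table, neither variant fails the drive.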
Thanks,
Chris
On 10/23/17, 5:23 PM, "Chris Walker" [off-list ref] wrote:
Hello,
We've noticed that for an array on which the bad block list has been
disabled, a failed write from a 'check' operation no longer causes the
offending disk to be failed out of the array. As far as I can tell,
this behavior changed with commit
https://github.com/torvalds/linux/commit/fc974ee2bffdde47d1e4b220cf326952cc2c4794,
which adopted the block layer badblocks code and deprecated the
MD-specific code.
It looks like this commit changed underlying code that adds a range of
bad blocks to the BB table (md_set_badblocks --> badblocks_set) such
that the sense of the return code reversed, from 0 meaning an error
occurred to 0 meaning success, but the return code due to a disabled BB
was left at 0. With this change, therefore, for arrays without a BBL,
calls to 'rdev_set_badblocks' changed from always a failure to always a
success, and code such as
        if (!rdev_set_badblocks(
                    rdev,
                    r10_bio->devs[m].addr,
                    r10_bio->sectors, 0))
                md_error(conf->mddev, rdev);
that previously would have failed the disk no longer does. Was this
change in policy deliberate?
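To make the reversal concrete, here is a small userspace sketch of the two return conventions as I understand them. It is a model under stated assumptions, not the kernel implementation: struct mock_bb, old_set, new_set, and the caller_ok_* helpers are hypothetical names, and the only behaviour modelled is the return value for an enabled versus a disabled list.

```c
/* Hypothetical stand-in for the bad block table state. */
struct mock_bb {
    int shift; /* < 0 means the bad block list is disabled */
};

/* Old MD-specific convention: 1 = range recorded, 0 = error.
 * A disabled list returned 0, which callers read as an error. */
static int old_set(const struct mock_bb *bb)
{
    if (bb->shift < 0)
        return 0;
    return 1; /* assume the table has room */
}

/* New block-layer convention: 0 = success, nonzero = error.
 * The disabled-list return value stayed at 0, which now reads as success. */
static int new_set(const struct mock_bb *bb)
{
    if (bb->shift < 0)
        return 0;
    return 0; /* assume the table has room */
}

/* What a caller sees, normalised to "nonzero means the range was recorded". */
static int caller_ok_old(const struct mock_bb *bb)
{
    return old_set(bb) != 0;
}

static int caller_ok_new(const struct mock_bb *bb)
{
    return new_set(bb) == 0;
}
```

For a disabled list (shift < 0), caller_ok_old reports failure and caller_ok_new reports success, which is the flip described above. For an enabled list with room, both report success.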
Thanks,
Chris