Thread (15 messages) 15 messages, 3 authors, 2015-08-01

[PATCH v9 2/5] mtd: nand: vf610_nfc: add hardware BCH-ECC support

From: stefan@agner.ch (Stefan Agner)
Date: 2015-07-31 23:39:11
Also in: linux-devicetree, lkml

Hi Brian,

On 2015-08-01 01:09, Brian Norris wrote:
<snip>
quoted
+static inline int vf610_nfc_correct_data(struct mtd_info *mtd, uint8_t *dat)
+{
+	struct vf610_nfc *nfc = mtd_to_nfc(mtd);
+	u8 ecc_status;
+	u8 ecc_count;
+	int flip;
+
+	ecc_status = __raw_readb(nfc->regs + ECC_SRAM_ADDR * 8 + ECC_OFFSET);
+	ecc_count = ecc_status & ECC_ERR_COUNT;
+	if (!(ecc_status & ECC_STATUS_MASK))
+		return ecc_count;
+
+	/*
+	 * On an erased page, bit count should be zero or at least
+	 * less then half of the ECC strength
+	 */
+	flip = count_written_bits(dat, nfc->chip.ecc.size, ecc_count);
Sorry I didn't notice this earlier, but it appears you are falling into
the same trap that almost everyone else is -- it is not sufficient to
check just the page area; you also need to check the OOB. Suppose that
a MTD user wrote mostly-0xff data to the page, then the page accumulates
bitflips in the spare area and a few in the page area, such that
eventually HW ECC can't correct them. If there are few enough zero bits
in the data area, you will mistakenly think that this is a blank page
below, and memset() it to 0xff. That would be disastrous!

Fortunately, your code is otherwise quite well structured and looks
good. A tip below.
quoted
+
+	if (flip > ecc_count && flip > (nfc->chip.ecc.strength / 2))
+		return -1;
+
+	/* Erased page. */
+	memset(dat, 0xff, nfc->chip.ecc.size);
+	return 0;
+}
+
+static int vf610_nfc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
+				uint8_t *buf, int oob_required, int page)
+{
+	int eccsize = chip->ecc.size;
+	int stat;
+
+	vf610_nfc_read_buf(mtd, buf, eccsize);
+
+	if (oob_required)
+		vf610_nfc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
To fix the bitflips issue above, you'll just want to unconditionally
read the OOB (it's fine to ignore 'oob_required') and...
quoted
+
+	stat = vf610_nfc_correct_data(mtd, buf);
...pass in chip->oob_poi as a third argument.
Hm, this probably will have an effect on performance, since we usually
omit the OOB if not requested. I could fetch the OOB from the NAND
controllers SRAM only if necessary (if HW ECC status is not ok...). Does
this sound reasonable?
quoted
+
+	if (stat < 0)
+		mtd->ecc_stats.failed++;
+	else
+		mtd->ecc_stats.corrected += stat;
You've got another problem here: ecc.read_page() should be returning
'max_bitflips' here. So, since you have a single ECC region, this block
should probably be:

	if (stat < 0) {
		mtd->ecc_stats.failed++;
		return 0;
	} else {
		mtd->ecc_stats.corrected += stat;
		return stat;
	}
Ok, will change that.

--
Stefan
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help