Re: [PATCH v3] cxl: mask slice error interrupts after first occurrence
From: Vaibhav Jain <hidden>
Date: 2017-04-28 06:38:56
Hi Alastair,

Thanks for addressing the previous review comments. A few additional, very minor comments below.

Alastair D'Silva [off-list ref] writes:
From: Alastair D'Silva <redacted>

In some situations, a faulty AFU slice may create an interrupt storm,
'interrupt storm of slice-errors,'
rendering the machine unusable. Since these interrupts are informational only, present the interrupt once, then mask it off to prevent it from being retriggered until the card is reset.
s|card|card/afu
@@ -1226,7 +1237,11 @@ static irqreturn_t native_slice_irq_err(int irq, void *data)
 	dev_crit(&afu->dev, "AFU_ERR_An: 0x%.16llx\n", afu_error);
 	dev_crit(&afu->dev, "PSL_DSISR_An: 0x%.16llx\n", dsisr);
+	/* mask off the IRQ so it won't retrigger until the card is reset */
+	irq_mask = (serr & CXL_PSL_SERR_An_IRQS) >> 32;
+	serr |= irq_mask;
 	cxl_p1n_write(afu, CXL_PSL_SERR_An, serr);
+	dev_info(&afu->dev, "Further interrupts will be masked until the
Optional: to be explicit, since you are masking only a subset of the possible slice errors, I would suggest rephrasing the message as: "Further such interrupts....
AFU is reset\n");
To be consistent with the patch description: s|AFU|AFU/Card

--
Vaibhav Jain [off-list ref]
Linux Technology Center, IBM India Pvt. Ltd.
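
For reference, the masking step in the quoted hunk can be sketched in isolation as below. This is only an illustration, not the driver code: DEMO_SERR_IRQS and demo_mask_fired_irqs() are hypothetical names, the bit positions are assumed, and the sketch relies on the assumption (implied by the ">> 32" in the hunk) that each status bit in the upper half of the register has its mask bit 32 positions lower.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for CXL_PSL_SERR_An_IRQS. The status bits are
 * assumed to sit in the upper 32 bits of the register; the real bit
 * assignments are defined by the driver and are not reproduced here.
 */
#define DEMO_SERR_IRQS 0xff80000000000000ULL

/* Return serr with the mask bit set for every status bit that has fired. */
static uint64_t demo_mask_fired_irqs(uint64_t serr)
{
	/* Assumption: each status bit has its mask bit 32 positions lower. */
	uint64_t irq_mask = (serr & DEMO_SERR_IRQS) >> 32;

	return serr | irq_mask;
}
```

Under these assumptions only the errors that have actually fired get masked, which is why "Further such interrupts" describes the behaviour more precisely than "Further interrupts". Applying the helper twice to the same value yields the same result, so repeated handling does not disturb bits that are already set.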