Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx
From: Stuart Yoder <hidden>
Date: 2013-03-04 16:16:13
On Mon, Mar 4, 2013 at 2:40 AM, Jia Hongtao [off-list ref] wrote:
A PCIe erratum on the mpc85xx may cause a core hang when a PCIe link goes down. When the link goes down, non-posted transactions issued via the ATMU that require completion result in an instruction stall. At the same time, a machine-check exception is generated to the core to allow further processing by the handler. We implement a handler that skips the instruction that caused the stall.
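To make the "skip the instruction" idea concrete, here is a minimal user-space sketch in C. It is not the code from the patch: `struct fake_regs`, `skip_stalled_load`, and the decision to handle only `lwz` are hypothetical, chosen for illustration. Filling the destination register with all ones is also an assumption, modeled on the way a failed PCI read conventionally returns all ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified register frame; NOT the kernel's struct pt_regs. */
struct fake_regs {
    uint32_t nip;      /* address of the instruction that took the exception */
    uint32_t gpr[32];  /* general-purpose registers r0..r31 */
};

#define INSN_SIZE 4u   /* classic Power ISA instructions are 4 bytes long */
#define OP_LWZ    32u  /* primary opcode of lwz (load word and zero) */

/* The primary opcode is the top 6 bits of the instruction word. */
static unsigned primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* The RT (destination register) field is the next 5 bits. */
static unsigned rt_field(uint32_t insn)
{
    return (insn >> 21) & 0x1fu;
}

/*
 * Sketch of the recovery step: if the stalled instruction is a load we
 * recognize, fake its result and step past it.
 * Returns 1 if the instruction was skipped, 0 if it was left alone.
 */
static int skip_stalled_load(struct fake_regs *regs, uint32_t insn)
{
    assert(regs != NULL);

    if (primary_opcode(insn) != OP_LWZ)
        return 0;  /* not handled in this sketch; caller must treat as fatal */

    /* Assumption: report all ones, as a failed PCI read would. */
    regs->gpr[rt_field(insn)] = 0xffffffffu;

    /* Resume at the next instruction instead of retrying the load. */
    regs->nip += INSN_SIZE;
    return 1;
}
```

For example, with `nip` at `0x1000` and the instruction word `0x80640000` (`lwz r3, 0(r4)`), the sketch sets `r3` to `0xffffffff` and advances `nip` to `0x1004`. A store such as `0x90640000` (`stw r3, 0(r4)`) is left untouched here.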
Can you explain at a high level how just skipping an instruction solves anything? If you just skip a load/store and continue as if nothing is wrong, isn't your system possibly in a really bad state? And if the core is already hung because the PCI link went down, isn't it too late? How does skipping help?

Stuart