Thread (28 messages), 4 authors, 2018-12-23

Re: [kernel, v6, 01/20] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

From: Michael Ellerman <hidden>
Date: 2018-12-23 13:57:52

On Wed, 2018-12-19 at 08:52:13 UTC, Alexey Kardashevskiy wrote:
The skiboot firmware has a hot reset handler which fences the NVIDIA V100
GPU RAM on Witherspoons and makes accesses no-op instead of throwing HMIs:
https://github.com/open-power/skiboot/commit/fca2b2b839a67

Now we are going to pass the V100 through via VFIO, which almost certainly
involves KVM guests. These are often terminated without getting a chance to
offline GPU RAM, so we end up with a running machine with misconfigured memory.
Accessing this memory produces hardware management interrupts (HMI)
which bring the host down.

This patch wires the hot reset hook into vfio_pci_disable() via
pci_disable_device(); the hook switches NPU2 to a safe mode, which prevents
the HMIs.
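
As a rough illustration of the wiring described above, here is a minimal
user-space sketch of an optional per-controller "disable" hook that a generic
disable path calls. It is not the kernel code from this patch: every name
(model_dev, model_controller_ops, model_npu_hot_reset, model_disable_device)
is hypothetical, and the "fenced" flag only stands in for whatever the
firmware hot reset actually does to the NPU2.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical per-controller callbacks; the hook is optional (may be NULL). */
struct model_controller_ops {
	void (*disable_device)(struct model_dev *dev);
};

/* Hypothetical device state, just enough to show the effect of the hook. */
struct model_dev {
	const struct model_controller_ops *ops;
	bool enabled;
	bool fenced;	/* stands in for "switched to a safe mode" */
};

/* Stand-in for asking firmware to hot reset the link and fence GPU RAM. */
static void model_npu_hot_reset(struct model_dev *dev)
{
	dev->fenced = true;
}

/* Controller ops for the modelled NPU: only the disable hook is set. */
static const struct model_controller_ops model_npu_ops = {
	.disable_device = model_npu_hot_reset,
};

/*
 * Generic disable path: run the controller hook if one is registered,
 * then mark the device disabled. Devices without a hook are unaffected.
 */
static void model_disable_device(struct model_dev *dev)
{
	if (!dev->enabled)
		return;
	if (dev->ops && dev->ops->disable_device)
		dev->ops->disable_device(dev);
	dev->enabled = false;
}
```

In this model the caller never needs to know whether a device sits behind an
NPU: it calls model_disable_device(), and only devices whose controller
registered the hook end up fenced.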

Signed-off-by: Alexey Kardashevskiy <redacted>
Acked-by: Alistair Popple <redacted>
Reviewed-by: David Gibson <redacted>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ab7032e793f9ad799ca2692046fba5

cheers