Thread: 23 messages, 7 authors, 2021-05-03

RE: [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA

From: Vikram Sethi <hidden>
Date: 2021-05-02 17:56:37
Also in: kvm, kvmarm, lkml

Hi Marc,

From: Marc Zyngier <maz@kernel.org>
> Hi Vikram,
>
> The problem I see is that we have VM and userspace being written in terms
> of Write-Combine, which is:
>
> - loosely defined even on x86
>
> - subject to interpretations in the way it maps to PCI
>
> - has no direct equivalent in the ARMv8 collection of memory
>   attributes (and Normal_NC comes with speculation capabilities which
>   strikes me as extremely undesirable on arbitrary devices)

If speculation with Normal NC to prefetchable BARs were a problem for devices,
those devices would already be broken on bare metal with ioremap_wc on arm64,
and we would need quirks there to map them as Device GRE instead of Normal NC.
If such a quirk were needed on bare metal, vfio/KVM could pick it up as well.
But we haven't seen any devices break when doing WC on bare-metal arm64, have we?
I know we have tested write combining with NICs on bare-metal arm64, as well as
with GPUs and NVMe CMB, without issues.

Further, I don't see why speculation to non-cacheable memory would be an issue
if the device allows prefetch without side effects, which is what a prefetchable
BAR is. If it is an issue for a device, I would consider that a bug that already
needs a quirk in the bare-metal/host kernel.
From the PCI spec: "A prefetchable address range may have write side effects,
but it may not have read side effects."
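
For reference, a minimal sketch of how the prefetchable attribute shows up in a
BAR value, and of the mapping choice argued for above. The constants are local
stand-ins for the BAR low bits (bit 0 selects I/O space, bit 3 marks a memory
BAR as prefetchable). The enum and the helper names bar_is_prefetchable_mem()
and pick_map_type() are hypothetical, made up for illustration; this is not
the vfio or KVM code.

```c
#include <assert.h>   /* only for the assert()-based check in the note below */
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the low bits of a PCI Base Address Register. */
#define BAR_SPACE_IO      0x01u  /* bit 0: 1 = I/O space, 0 = memory space */
#define BAR_MEM_PREFETCH  0x08u  /* bit 3: memory BAR is prefetchable */

/* Hypothetical mapping choices, named after the arm64 memory types
 * discussed in this thread. */
enum map_type {
    MAP_DEVICE_nGnRE,   /* Device memory: no gathering, no reordering */
    MAP_NORMAL_NC,      /* Normal non-cacheable memory */
};

/* True only for memory BARs with the prefetchable bit set. For an I/O BAR,
 * bit 3 is part of the address, so the space bit is checked first. */
static bool bar_is_prefetchable_mem(uint32_t bar)
{
    if (bar & BAR_SPACE_IO)
        return false;
    return (bar & BAR_MEM_PREFETCH) != 0;
}

/* Assumed policy: Normal NC for prefetchable memory BARs, Device otherwise. */
static enum map_type pick_map_type(uint32_t bar)
{
    return bar_is_prefetchable_mem(bar) ? MAP_NORMAL_NC : MAP_DEVICE_nGnRE;
}
```

Under this sketch, assert(pick_map_type(0xf000000cu) == MAP_NORMAL_NC) holds for
a 64-bit prefetchable memory BAR, while a non-prefetchable BAR such as
0xf0000000 stays on the Device mapping. A per-device quirk, if one were ever
needed, would override the result of pick_map_type().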

> How do we translate this into something consistent? I'd like to see an actual
> description of what we *really* expect from WC on prefetchable PCI regions,
> turn that into a documented definition agreed across architectures, and then
> we can look at implementing it with one memory type or another on arm64.
>
> Because once we expose that memory type at S2 for KVM guests, it
> becomes ABI and there is no turning back. So I want to get it right once and
> for all.

I agree that we need a precise definition of the Linux ioremap_wc API with
respect to what drivers (kernel and userspace) can expect: whether memset/memcpy
is expected to work, and whether aligned accesses are a requirement.
To the extent that ABI is set, I would think it is already set in the host
kernel, where WC on arm64 = Normal NC, so why should that not also be the ABI
for the same driver in VMs?

> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel