Thread (21 messages) 21 messages, 2 authors, 2017-02-28

Re: [PATCH kernel v5 10/10] KVM: PPC: VFIO: Add in-kernel acceleration for VFIO

From: David Gibson <hidden>
Date: 2017-02-28 01:19:52
Also in: kvm

On Mon, Feb 27, 2017 at 02:20:13PM +1100, Alexey Kardashevskiy wrote:
On 27/02/17 12:53, David Gibson wrote:
quoted
On Fri, Feb 24, 2017 at 02:43:05PM +1100, Alexey Kardashevskiy wrote:
quoted
On 24/02/17 14:36, David Gibson wrote:
quoted
On Fri, Feb 24, 2017 at 02:29:14PM +1100, Alexey Kardashevskiy wrote:
quoted
On 24/02/17 13:14, David Gibson wrote:
quoted
On Wed, Feb 22, 2017 at 07:21:33PM +1100, Alexey Kardashevskiy
wrote:
[snip]
quoted
quoted
quoted
quoted
quoted
+static long kvmppc_rm_tce_iommu_unmap(struct kvm *kvm,
+		struct iommu_table *tbl, unsigned long entry)
+{
+	enum dma_data_direction dir = DMA_NONE;
+	unsigned long hpa = 0;
+	long ret;
+
+	if (iommu_tce_xchg_rm(tbl, entry, &hpa, &dir))
+		return H_HARDWARE;
To avoid a double WARN() (and to make the warnings easier to
understand) I'd suggest putting a WARN_ON() here, rather than in the
callers when they receieve an H_HARDWARE.  IIUC this really shouldn't
ever happen, and it certainly can't be the guest's fault?

Makes sense.
I guess it might want WARN_ON_ONCE() to avoid spamming the user with
errors for every TCE, though.

We do not expect this to happen at all :) I can convert all of them to
_ONCE really as the purpose of WARN_ON is mostly to document what we do not
expect.
Sure, seems reasonable.

[snip]
quoted
quoted
quoted
quoted
quoted
@@ -220,9 +338,10 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
 {
 	struct kvmppc_spapr_tce_table *stt;
 	long i, ret = H_SUCCESS;
-	unsigned long tces, entry, ua = 0;
+	unsigned long tces, entry, ua = 0, tce, gpa;
 	unsigned long *rmap = NULL;
 	bool prereg = false;
+	struct kvmppc_spapr_tce_iommu_table *stit;
 
 	stt = kvmppc_find_table(vcpu->kvm, liobn);
 	if (!stt)
@@ -287,12 +406,24 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
 	}
 
 	for (i = 0; i < npages; ++i) {
-		unsigned long tce = be64_to_cpu(((u64 *)tces)[i]);
+		tce = be64_to_cpu(((u64 *)tces)[i]);
 
 		ret = kvmppc_tce_validate(stt, tce);
 		if (ret != H_SUCCESS)
 			goto unlock_exit;
 
+		gpa = tce & ~(TCE_PCI_READ | TCE_PCI_WRITE);
+
+		list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
+			ret = kvmppc_rm_tce_iommu_map(vcpu->kvm,
+					stit->tbl, entry + i, gpa,
+					iommu_tce_direction(tce));
+			if (WARN_ON_ONCE(ret == H_HARDWARE))
I don't think you need the WARN() here - the only H_HARDWARE failure
path in iommu_map() already includes a WARN().

True, I can drop it here.

quoted
quoted
+				kvmppc_rm_clear_tce(stit->tbl, entry);
+			else if (ret != H_SUCCESS)
+				goto unlock_exit;
It's also not clear to me why the H_HARDWARE error path clears the
entry, but the other failure paths don't.  Or why an H_HARDWARE will
result in continuing to set the rest of the TCEs, but other failures
won't.

The idea was that other failures still have some chance that handling may
succeed in virtual mode or via QEMU, H_HARDWARE is fatal.
Um... yes.. but the logic seems to be backwards for that: on
H_HARDWARE you warn and keep going, on other errors you bail out
entirely.
By "fatal" I means fatal for this particular hardware TCE(s), no hope in
trying this particular TCE in virtual mode.
Ok... still not following why that means the "fatal" error results in
continuing to attempt for the rest of the updated TCEs, whereas the
"non fatal" one bails out.
I was applying the principle that if after all checks done we still cannot
update the hardware table, then just clear the TCE and move on. Or I
misunderstood the idea?
*Still* not seeing why if we cannot update the hardware table we keep
trying with the rest of the entries, but on other failures we don't.
quoted
Especially since the bail out will only go
to virtual mode if ret == H_TOO_HARD, which it isn't clear is the only
possibility.

H_TOO_HARD goes to virtual mode, H_TOO_HARD in virtual goes to the
userspace (QEMU).

Will "if (WARN_ON_ONCE(ret != H_SUCCESS && ret != H_TOO_HARD))" make more
sense?
Probably, but depends what's in the if.
quoted
quoted
quoted
quoted
I am just not sure if H_PARAMETER is what I want to return at [1], to make
the calling code simplier, I could return H_HARDWARE there as well (instead
of H_PARAMETER).
That sounds right, IIUC the gpa to ua translation shouldn't ever
fail because of something the guest did. 

The guest can easily pass bad TCE/GPA which is not in any registered slot.
So it is rather H_PARAMETER.
Ah, yes.
quoted
quoted
So I'd expect either
H_HARDWARE, or H_TOO_HARD (if there's some hope that virtual mode can
make the translation when real mode couldn't).
No, virtual mode uses the exact same helper.
Ok.



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help