Thread (48 messages) 48 messages, 6 authors, 2009-11-05

Re: [PATCH 14/27] Add book3s_64 specific opcode emulation

From: Alexander Graf <hidden>
Date: 2009-11-05 10:09:48
Also in: kvm

On 05.11.2009, at 01:53, Segher Boessenkool wrote:
quoted
quoted
quoted
+		case OP_31_XOP_EIOIO:
+			break;
Have you always executed an eieio or sync when you get here, or
do you just not allow direct access to I/O devices?  Other context
synchronising insns are not enough, they do not broadcast on the
bus.
There is no device passthrough yet :-). It's theoretically  
possible, but nothing for it is implemented so far.
You could just always do an eieio here, it's not expensive at all
compared to the emulation trap itself.

However -- eieio is a Book II insn, it will never trap anyway!
Don't all 31 ops trap? I'm pretty sure I added the emulation because I  
saw the trap.
quoted
quoted
quoted
+		case OP_31_XOP_DCBZ:
+		{
+			ulong rb =  vcpu->arch.gpr[get_rb(inst)];
+			ulong ra = 0;
+			ulong addr;
+			u32 zeros[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
+
+			if (get_ra(inst))
+				ra = vcpu->arch.gpr[get_ra(inst)];
+
+			addr = (ra + rb) & ~31ULL;
+			if (!(vcpu->arch.msr & MSR_SF))
+				addr &= 0xffffffff;
+
+			if (kvmppc_st(vcpu, addr, 32, zeros)) {
DCBZ zeroes out a cache line, not 32 bytes; except on 970, where  
there
are HID bits to make it work on 32 bytes only, and an extra DCBZL  
insn
that always clears a full cache line (128 bytes).
Yes. We only come here when we patched the dcbz opcodes to invalid  
instructions
Ah yes, I forgot.  Could you rename it to OP_31_XOP_FAKE_32BIT_DCBZ
or such?
Good idea.
quoted
because cache line size of target == 32.
On 970 with MSR_HV = 0 we actually use the dcbz 32-bytes mode.

Admittedly though, this could be a lot more clever.
quoted
quoted
quoted
+		/* guest HID5 set can change is_dcbz32 */
+		if (vcpu->arch.mmu.is_dcbz32(vcpu) &&
+		    (mfmsr() & MSR_HV))
+			vcpu->arch.hflags |= BOOK3S_HFLAG_DCBZ32;
+		break;
Wait, does this mean you allow other HID writes when MSR[HV] isn't
set?  All HIDs (and many other SPRs) cannot be read or written in
supervisor mode.
When we're running in MSR_HV=0 mode on a 970 we can use the 32 byte  
dcbz HID flag. So all we need to do is tell our entry/exit code to  
set this bit.
Which patch contains that entry/exit code?
That's patch 7 / 27.

+	/* Some guests may need to have dcbz set to 32 byte length.
+	 *
+	 * Usually we ensure that by patching the guest's instructions
+	 * to trap on dcbz and emulate it in the hypervisor.
+	 *
+	 * If we can, we should tell the CPU to use 32 byte dcbz though,
+	 * because that's a lot faster.
+	 */
+
+	ld	r3, VCPU_HFLAGS(r4)
+	rldicl.	r3, r3, 0, 63		/* CR = ((r3 & 1) == 0) */
+	beq	no_dcbz32_on
+
+	mfspr   r3,SPRN_HID5
+	ori     r3, r3, 0x80		/* XXX HID5_dcbz32 = 0x80 */
+	mtspr   SPRN_HID5,r3
+
+no_dcbz32_on:
quoted
If we're on 970 on a hypervisor or on a non-970 though we can't use  
the HID5 bit, so we need to binary patch the opcodes.

So in order to emulate real 970 behavior, we need to be able to  
emulate that HID5 bit too! That's what this chunk of code does - it  
basically sets us in dcbz32 mode when allowed on 970 guests.
But when MSR[HV]=0 and MSR[PR]=0, mtspr to a hypervisor resource
will not trap but be silently ignored.  Sorry for not being more  
clear.
...Oh.  You run your guest as MSR[PR]=1 anyway!  Tricky.
Yeah, the guest is always running in PR=1, so all HV checks are for  
the host. Usually we run in HV=1 on the host, because IBM doesn't sell  
machines that have HV=0 accessible for mortals :-).


I'll address your comments in a follow-up patch once the stuff is  
merged.

Alex
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help